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Abstract 



In our preceding paper, we have proposed an algorithm for obtaining finite- 
norm solutions of higher-order linear ordinary differential equations of the 



with rational-number- valued coefficients), by using only the four arithmeti- 
cal operations on integers, and we proved its validity. For any nonnegative 
integer k, it is guaranteed mathematically that this method can produce all 
the solutions satisfying J \ f{x)\'^ {x^ + 1)^ dx < oo , under some conditions. 
We materialize this algorithm in practical procedures. An integer- type quasi- 
orthogonalization used there can suppress the explosion of calculations. More- 
over, we give an upper limit of the errors. We also give some results of numer- 
ical experiments and compare them with the corresponding exact analytical 
solutions, which show that the proposed algorithm is successful in yielding 
solutions with an extraordinarily high accuracy (using only arithmetical oper- 
ations on integers). 
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Since linear higher-order ordinary equations with general coefficient functions cannot 
be solved analytically except in very special simple cases [1] , numerical methods are 
useful in many applications, such as the eigenfunction problem for a differential 
operator on a complete function space "H. Many approaches with numerical methods 
have been proposed for this problem. One method is based on the approximation of 
solutions in a finite-dimensional subspace of the original function space "H [2] 

Using an idea somewhat similar to this approach, our preceding paper [3] proved 
the validity of our newly-proposed integer-type numerical method using smooth (i.e. 
analytical) basis functions which was able to obtain approximations of all the true 

solutions in Ti of the Fuchsian-type differential equation (^^^ ^mix) (^)'") /(•^) ~ ^ 

m=0 

with rational-function-type coefficient functions rm{x) using only the four arithmeti- 
cal operations on integers, for a class of complete function spaces H containing L^(R) 
as a special case. This algorithm can be applied even for the non-Fuchsian cases, 
under some restrictions. However, the method proposed in that paper is considerably 
different from the usual methods based on approximation in a finite-dimensional sub- 
space, such as the Ritz and Galerkin methods [2] [3] . The main differences from usual 
Galerkin methods were explained in the introduction of [1]. The method proposed 
in |4j can be regarded as a kind of 'semi- analytical method' rather than a purely 
numerical method, in that it is closely related to the functional analysis, Fourier 
series and the Laurent expansion of complex functions. In addition, as a remarkable 
characteristic of the proposed method, all the basis functions used there are rational 
functions of the coordinate which are related to a power series of the Cayley trans- 
form of the coordinate. This characteristic enables us to close all procedures in the 
method only within four arithmetic operations between rational numbers and hence 
between integers. These facts imply that the proposed method can be discussed from 
some viewpoints of mathematical analysis. 

This method is perfectly free from round-off error because it consists only of 
integer-type operations, when we choose function spaces and their basis systems 
appropriately. It has only two types of errors instead of round-off errors. One is 
the 'pure' truncation error due to the components outside the subspace (contained 
even in true exact solutions), and the other is the 'mixture error' due to the (slight) 
mixture of extra solutions not corresponding to true solutions in "H. However, as 
is proved in |1] , the latter mixture error converges to zero as the dimension of the 
subspace tends to infinity, with this method. Moreover, the 'pure' truncation error 
decays very rapidly in the proposed method, due to the relationship between the 
Fourier series and the expansion used in the proposed method. 

Moreover, this method requires a small amount of calculations for obtaining high- 
accuracy solutions. For example, when the coefficients in the expansion of a true solu- 
tion by the basis functions decay exponentially, the amount of calculations required 
by this method is almost proportional asymptotically to the cube of the number 
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of required significant digits. This is a strong advantage in comparison with the 
Runge-Kutta methods and the finite elements methods which require the amount 
of calculations with exponential orders of the required significant digits. From this 
advantage, as is shown later in numerical results, by the proposed method, we can 
easily attain the accuracy with several hundred or several thousand significant digits 
by an ordinary personal computer in many cases. 

Since the structure and the procedures of this method are somewhat complicated, 
in our preceding paper we explained only its abstract structure, its mathematical 
validity and a rough sketch of the procedures in the proposed algorithms. In this 
paper, we will explain the concrete materialization of this method, and will provide 
some theoretical analysis of the accuracy of this method. 

In the previous paper [1], we have not explained how to calculate the matrix 
elements of the band-diagonal matrix efficiently in a concrete discussion. In this 
paper, we will show that we can calculate them from a small number of coefficients 
which can be derived easily from the recursion formulae among the basis functions 
only by four arithmetic operations among integers when the coefficients in the ra- 
tional functions Pm.{x) {m = 0, 1, ... , M) are rational complex numbers. Moreover, 
we will explain the detailed procedure of a quasi-minimization of the ratio between 
two quadratic forms which is effective in removing extra solutions of the system of 
simultaneous linear equations, while it has been omitted in [4j . We will propose and 
illustrate in detail an integer-type quasi-orthogonalization, which is effective in the 
above quasi-minimization and requires a relatively small amount of calculations. 

The proof of the convergence to true solutions of the numerical solutions obtained 
by means of this quasi-orthogonalization is one of the main result of this paper. 
Moreover, we will prove the halting of this process. Another main result of this 
paper is the proof of an inequality which gives an upper bound of errors contained 
in numerical solutions obtained by the proposed method. 

The contents of this paper are as follows: In Section [21 we survey the basic 
structures of this method proposed and proved in [1]. In section[3l the function spaces 
and the basis systems used in this method are surveyed. In section HJ we explain 
how to calculate the 'matrix elements' of the matrix representation of a differential 
operator using only the coefficients of the differential operator. In Section [5l we 
specify the concrete procedures for the algorithm proposed only abstractly in [4] , for 
the cases where the space of true solutions in "H is one- dimensional. In Section |6l we 
prove that the realization proposed in Section E] satisfies the conditions required for 
the convergence of the mixture error to zero. In Section [TJ we extend the realization 
proposed in Section |5] to general cases where the space of obtainable true solutions 
in H is multi-dimensional, and we prove the convergence of the mixture error to 
zero even for this extension in Section |8l Since the procedures proposed in Section |5] 
contain iterations under given conditions, we prove that they halt in a finite number 
of steps in Section O In Section [TDl we will state a theoretical upper bound for the 
errors. In Section [TTl we give numerical results which clarifies how accurate results 
our method gives. Moreover, in that section, we compare theoretically the order of 
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the amount of required calculations between some typical existing methods and the 
proposed method. Finally, we present our conclusions. 

2 Survey of basic structure 

As was introduced in [1], we propose an integer-type algorithm for finding the 
C'^-solutions in a function space "H of a higher-order linear ordinary differential 
equation R{x, ^)/ = of Fuchsian type given by a differential operator R{x, ^) = 

M 

rm{x) (^)™' with rational functions rm{x) {m = 0, 1, M) of x with rational- 

(complex-) valued coefficients such that rjv/(±0 7^ ^^"^ coefficient functions 
rm{z) (m = 1, 2, . . . , M — 1) have no singularity at 2; = ±i. Even for non- Fuchsian 
cases, the proposed method can be applied with some restrictions [1] [5]. 

By multiplying the least common multiple of the denominators rm{x) {m = 
0, 1, M), this problem can be simplified, without loss of generality, into the prob- 

M 

lem to solve the ODE P(x,-^) = ^^pm(a;) (^)™' with polynomials Pm{x) {m = 

m=0 

0, 1, M) of X with rational-(complex-)valued coefficients such that pm(±^) 7^ 0. 

Moreover, for Fuchsian cases, as is shown in another paper [1], by multiplying 
an appropriately chosen polynomial, the above problem can be modified into the 

M 

problem to solve the ODE Q{x, ^) = qm{x) (^)™ with polynomials qm{x) (m = 

m=0 

0,1,. ..,M) of X with rational- (complex-) valued coefficients such that gM(±^) 7^ 
and the multiplicities of the zero points of qM{x) are not smaller than M. When 
Pm{x) has no zero point, without any conflict with the above condition about the 
multiplicities of zero points, we may set Q{x, ^) = P{x, ^). 

This method is based on the band-diagonal matrix representation of the operator 
Bq which is defined by the closure of the operator Bq : H — )■ Ti^ {Bf = Q{x, ^)/) 
with domain D{Bq) := {f e C^^ {R) nn\Q{x, £)f E H^}, where is a function 
space which contains (as a set) T-L, but which has a different norm, (■,-)^o, from 
the norm for H, (■, ■)^. Under the conditions C1-C4, CI"*" and C2^ below, we can 
show the validity of this band-diagonal matrix representation 6^ := (i^QCn, e^)^o 
with respect to orthonormal basis systems {en}nez+ and {e2}nez+ ^ 
respectively, i.e. a one-to-one correspondence is guaranteed between a solution / 
in C^(M \S) nT-L {S: set of singular points of the ODE) of the differential equation 
Q(x, £)f = and a vector f inVn e'^{Z+) with 

m+£o 

V:={f\J2blf^ = 0imeZ+)}, (1) 

n=max(0,m— fo) 
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oo 

where / = /„e„ The conditions required are: 

n=0 

CI There exists a CONS {e„ | n G Z+} of 7i such that e„ G D{Bq). 

C2 There exist an integer £o and a CONS {e^| n G Z+} of "H^ such that 
6:;^ := {BqCn, e^,)^o = when \n - m\ > Eq. 

C3 There exists a hnear operator Cq with domain D{Cq) from a dense subspace of 
to H such that G D{Cq) and (Eg/, e^)^o = (/, Cge^)^ for / G D{Bq). 

m+io 

C4 For any sequence {fn}'^=o ^ -^^ satisfying ^m/n = (m G Z^), the sum 

n=max(0,m— £o) 

AT 

/nCn converges (with respect to the "H-norm) to a solution / G C^^(]R) fl Ti 

n=0 

of Q(x,£)/ = OasiV-^oo. 

/oo 
-oo 

/oo 
f{x)g{x)v^{x)dx. 
-oo 

Especially when the differential equation has no singular points, Conditions C1+ 
and C2+ can be exempted, and then we can show the one-to-one correspondence 
between a solution / in C^(K) fl 7/ of the differential equation Q(x, -^)f = (the 
same as P(x, ^)/ = 0) and a vector / in n f{Z+) only from Conditions C1-C4. 

From the above discussions, in order to obtain true solutions of the ODE, we 
should extract only square-summable vector solutions of the system of simultane- 

m+lo 

ous linear equations ^m/n = (m G Z~^). This extraction can be made 

n=max(0,m— i?o) 

approximately by the method explained below when the following condition C5 is 
satisfied. 

C5 There exists an integer jo ^ Z+ such that b^'^^° ^ for any integer m > jo 
(m G Z+). 

In this case, the dimension D of is equal to that of 

po 

^PoV = {{fuVrilo \J2^lfn = (m = 0, 1, Jo - 1) }, (2) 

n=0 

where 

Po := Jo + io-l (3) 
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and the truncation operator 11^ is defined by 

\0 (n>m). 

In the following, for simplicity, we sometimes identify Hmf with the corresponding 
(m + l)-dimensional vector. 

For the extraction only of square-summable vector solutions, we choose a bounded 
bilinear form Q{f, g) on £^(Z"'") x £^(Z"'") (and the corresponding quadratic form 
:= /) on £^(Z+) ) and the integers K and satisfying 

oo 

£2(Z+), Q{f) > \\f\\% := J2 \fn\', (5) 

n=0 

iV>i^ > Jo + ^0-1, (6) 
and define the ratio and its minimum: 

:= for/GV\{0} (7) 

\\f\\%,K 

^^k!n ■■= .min crfk/)- (8) 

/ey\{0} 

Similarly, we define 

^^if) ■■= forfeV\{0} (9) 

\\f\\%,K 

^fl ■■= .min aflU). (10) 

/enw 

The proposed method yields all the vectors in a linear space Vk satisfying the fol- 
lowing condition: 



Vk C {{<Tf^^r'%ca%\) U {0} (11) 



The proposed method is materialized by means of a practical algorithm or finding 
a basis system of Vk satisfying ( fTTi) . This algorithm is based on the intermediate idea 
between the Gram-Schmidt orthogonalization and the Euclidean algorithm, which 
requires a relatively small amount of calculations. 

When the purpose is to calculate the truncated elements £^ solution liKiV fl 
£^(Z"'")), the error is evaluated by the norm concerning inner product {x,y)£2^K '■ = 
{IlKX,IlKy)e2. Denoting the the projection to W concerning this inner product by 
Pw,K, we can evaluate the accuracy of our result Vk for the worst case by 

\\Pv,kx - x\\e2^K 
sup sup — —. (12) 



It can be shown that this value goes to and all the true solutions of the ODE can 
be approximated by our solution space, from the following theorems 
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Theorem 2.1 (Theorem 2.5 of [1]) For fixed K , when N goes to infinity, the con- 
vergence 

sup 7-^ ■ > (13) 

holds. 

Theorem 2.2 (Theorem 2.6 of [4J) Assume that Q{x) = ||x||£2. When we choose 
sufficiently large numbers Nq and cq, then for any N > Nq and c> Cq, we have 

Pv,KVK = iiK{vnf{z-^)) (14) 

for any choice ofVx- 

Their proofs have been given in Section 4 of [1]. 

oo 

As a practical choice of the quadratic form Vl{f, g), we can use ^{f, g) = fnVnWn 

n=0 

with a non-decreasing 'weight number sequence' satisfying Wn = I ioi n < K and 
Wn = R for n > J with integers K and J such that j + io — 1<K<J<N. To 
enable discussion of the upper error bound, we limit the bilinear form to this class. 
In particular, the weight number sequence Wn used in numerical experiments of this 
study is 

1 in<K) 
:= { e^('^"-'^^) {K <n< J) (15) 
R ■= e'-{/^^-^'^) {n > N) 

_ kg + 1 ko + 1 



with fin : 

Empirically, the choice with r = 10^, K = 2[^J + ko and J = 2[^J + ko or 
K = 2[^J + ko and J = 2[i|^J + ko often gives good results, for example. This 
choice preserves the symmetry property of the basis system introduced below. The 
specification of the class of bilinear forms given in this paragraph is not related to 
the following sections, except for Sections [lO] and [HI 



3 Function spaces and basis systems used in the 
algorithm 

In this section, we will introduce the function spaces Ti, and their basis systems 
{e„| n G Z"*"}, {e^l n G Z"*"} which satisfy the conditions C1-C4 and C5. 
First, define the inner product and norm parametrized by A; G Z as 



f{x)W) {x^ + ^fdx and \\f\\(k) := \fix)fix' + l)'dx 

oo J — oo 
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Now we can introduce the Hilbert space of functions with inner product (/, g)(k) 

Lfk)0^) ■= {f ■ measurable \ \\f\\(k) < 00} . 

Then, Lf^){R) C if k > k. Note that Lf^^iR) = L'^{R). For the spaces n and 

Ti^, we choose 

n = LliR), n^ = Ll,iR) (16) 
with integers fco and satisfying k^ > and k9. <kQ — max (degpm ~ 

me{0,l,...,Af} 

Next, we will introduce the basis function systems. Define the wavepacket func- 
tions 

The indices of functions in {V'fccri | G Z} are bilaterally expressed, whereas the 
indices of basis functions in {e„}J^Q are unilaterally expressed, and they are 'matched' 
to one another by the one-to-one mapping defined by fi^ „ in f ll8p below. In order to 
avoid confusion between them, in this paper, the integer indices with double dots " 
denote the bilateral ones in Z, in contrast with the unilateral ones (without double 
dots) in Z+. The functions introduced in the above satisfy the symmetry property 

These are sinusoidal-like wavepackets with spindle-shaped envelopes. An example 
of the shapes of these wavepackets is illustrated in Figures [T] and |21 As is explained 
in Appendix [B] in detail, the wavepackets defined by ( II 7p are 'almost-sinusoidally' 
oscillating wavepackets with spindle-shaped envelopes \ilJk,n{,x) \ = (x^ + 1) ^ , when 
> 0, and their approximation to a sinusoidal wavepacket with Gaussian envelope 

^fe,n(^) ~ (const.) ■ e e"5^' . 

holds for sufficiently large k, in the sense that we can show the convergence 
lim II Sfc^n e * ^'ipk,n(^i — 6^2^^^ 11= ( S^. „ : const.) with respect to 

the L^-norm. This property shows the suitability of the wavepackets for expanding 
almost-localized smooth solutions. 

Moreover, as is shown in our preceding paper |5], they are related to the basis 
functions of Fourier series, under a change of variable x ^ 9 := 2arctanx used also 
in another field [7] [8]. 

These functions are used for the orthonormal basis systems {e„}^Q and {e^},^Q 
of H and T-L^, respectively, as follows: 

Define 

V V '■J ,n 

with hk,n := [-^J + (-1)"+'=+^ [^J . 
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Figure 1: Graph of ^'2,5(2;) Figure 2: Graph of ^5,3(0;) 

Under the choices (fT6|) and (ITSl) . we can show that the assumptions C1-C4 are 
satisfied [1]. Moreover, if qM^^i) 7^ 0, the assumption C5 is also satisfied [1]. We 
can show that 

io = 2M + ko- and jo = max(A;^, 0) (19) 

for Conditions C2 and C5 [4] [5]. In addition, if all the coefficients of the polynomials 
Qmix) (0 < m < M) belong to Q + Qi, then C7 is always satisfied [1]. 

Here we point out some properties of ^pk,h defined in (|T71) . which will be important 
later. 

Theorem 3.1 For any integer n, 

i^k,h{x) = -- - (20) 

XllJk,h{,x) = ^ (V^fc_l,ji(x) + 1pk~i^n+i{x)) (21) 

£ ipk,h{x) = fl %l)k+l,h-l{x) - {fi + k + l) il)k+i,n{x) . (22) 

This theorem can be derived directly from the definition of ipn^k- It is essential for 
the conditions C2 and C 5 as is shown in and it will be useful for showing how 
to calculate the matrix elements by integer-type programs in Section |H 

4 Matrix elements calculated using only the co- 
efficients of the polynomials in the differential 
equation 

In order to find solutions of the simultaneous linear equations b]^fn = (m G Z+), 
we have to determine the 'matrix elements' 6^ {m,n G \m — n\ < io) from the 
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differential operator Q(x, -£;). In this section, we explain how we can calculate these 
matrix elements from just the coefficients qmj (m = 0, 1, . . . M; j = 0, 1, . . . deggm) 

deg qm M 

of the polynomials qm{x) = QmjX^ in the differential operator Q'm(a;)(^)™ 

j=0 m=0 

by means of a recursive use of the properties fj20|) . fl2T]) and fl22|) of ^/'^.n- 

In |1] and in Section [2] of this paper, the matrix element h"^ is defined in a 
general framework only as the inner product {BQen,e^)~. However, because of 
these properties of 'ipk,h, we can determine from just the coefficients qmj {rn = 
0, 1, . . . M; j = 0, 1, . . . deggm), without any calculations of inner products. 

In order to explain this, we introduce the matrix elements 6^ (m, n G Z) in the 
'bilateral expression'. Define 

where the last equality is derived from the definitions in f lTSj) together with the 
'matching' between the unilateral and bilateral expressions. 

The function Q{x, ■£:)4'ko,h can be expressed as a linear combination of ipj^o ^ 
with m = n — M, n — M + l,...^h + M + kQ — k^, by the linear combination of the 
results of the recursive use of fl22|) m times, that of fl2T|) j times and that of fl20|) 
ko — + m — i times. From this fact, it is easily shown that the matrix elements 
e Z; 71 - M < m < h + M + ko - k^) can be written polynomial 
in ri whose order is not greater than M, under fixed k. For m<'n, — M — lor 
rh>h + M + ko-k^ + l, b% = 0. 

Since these polynomials are similar to one another for the matrix elements with 
common difference rh — n, under the representation of 6^ by a function of m — fi and 
h, the power series expansion 

M 

bi = J2i^^-n,sr^ (24) 

with coefficients Pr^g {s = 0,1,..., M) is convenient. Here note that Pr,s = for 
f < —M — 1 or f > M + ko-k^ + 1. Hence, we can calculate all the matrix elements 
6^ by the expansion dMD from just the (2M + ko- k^ + 1){M + 1) coefficients i3r,s 
with f = -M, -M + 1, . . . , M + fco - ^0 and s = 0, 1, . . . , M. 

By these relations, the calculations of the coefficients Pr^s can be performed by 
the procedures in Tabled! where the modules (I), (X) and (D) just correspond to 
dSni), (EZD and ([2H]), respectively. 

It is easily shown that all the coefficients Pr^g are rational-(complex-)valued when 
all the coefficients qm,j of the polynomial in the differential operator are. Hence, if 
all the coefficients qmj are rational- (complex-) valued, so are all the matrix elements 
b"^ (and hence b"^). This fact shows that the condition C7 is satisfied if all the 
coefficients qmj are rational- (complex-) valued. 
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In terms of the unilateral expression, the matrix elements 6^ can be calculated 

by 



M 



S = 

M 



(25) 



(if m + n is even) 



X]^((-l)-+fc+l(m-n)/2),s ^^''0,n)' 
s=0 

M 

. ,,,w (^fcnn)'^ (if m + n is odd) 

^((-l)'"+'=+i(m+n+l)/2j,s ^ '^"'"^ ^ ^ 



V> s=0 



from just the coefficients j3r^s with f = — M, — M + 1, . . . , M + /cq — /cq and 

s = 0, 1, . . . , M. Here note that a matrix element 6^ with n and m such that m + n 

is an odd integer greater than Iq + — (= 2M + 2ko — 2A;q ) vanishes even when 



\m — n\ < io because /3r^s = for r < —M — 1 or f > M + /eq — + 1- This 
fact implies the 'sparseness' of the 'band' in the band-diagonal representation of Bq. 
Moreover, this fact implies that the calculations of the matrix elements 6^ can be 
carried out using only integer-type programs. 

The unilateral expression is somewhat less convenient for practical programs than 
the bilateral one, and we use the bilateral expression in actual calculations. However, 
in this paper, we follow the unilateral expression for consistency with the mathemat- 
ical framework introduced in [4]. 

The coefficients (3r, s with f = -M, -M + 1, . . . , M + ko - k^ and s = 0,1, . . . , M 
can be calculated by the recursive use of the relations ( l20l) . ( 12T|) and ( l22l) . which can 
be realized by the procedures given in Table[T]which are programmable on computers, 
as follows: From these relations, the renewal of the coefficients s in Table [1] can 
be carried out by means of the relations 



2 .. 

-ar 
2 ' 



ri^'tpk-i^n+rix) (26) 



X 



\ f s / f s ^ ' 



k—l,n+r 



[x) (27) 



^ ^r,s n'i;k,n+fix)^ (28) 

Y'Y(^ - {r + k + l)ar,s + {r + l)ar+i,s - ar,s-l + ar+l,s-l^ h''^k+l,h+r{x) . 
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M 

Table 1: Calculation of Pf,s for -M < r < M + ko - ( ^ ri" 



"n+r 



s=0 



for r = -M to M + ko - k^ 
for s = to M 

for m = to M 
for j = to deg 

K i— k 

for f = — m to m-\- ko — k^ 
for s = to M 



. [ 

terate (D) below m times. 

for f = —m to m + kQ ~ 
for s = to m — 1 



ttr, s+1 ^ "r.s + OLr+1, s 

cxf^s ^ ar,s - (r + K + l)Q;r,s + r af+i^ , 
for r = —m to m + /cq — ~ 1 
for s = to m 



K ^ K + 1 

Iterate (X) below j times. 



(X)^ 



for r = —m + 1 to m -\- ko — k^ 
for s = to m 

for f = — m + 1 to m -\- ko 
for s = to m 



[ 



Iterate (I) below ko — k^ — j -\-m times. 

for f = — m + 1 to m-\- ko — k^ 
for s = to m 



(I) { 



a 



r,s 



for f 



for f = — m + 1 to m-\- ko — k^ 
for s = to m 

-ko + k9. — m to m 



for s = to M 



Practical integer-type algorithm for higher-order differential equations 

5 Concrete procedures for the algorithm 
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In this section, we explain the concrete procedures of our algorithm in detail, though 
its basic idea was sketched in the Algorithm in Section 2 of As a first explana- 
tion, for simplicity, we will explain it for the cases with dim V^n£^(Z"'") = 1, though a 
similar method is possible even when dim n£^(Z'^) > 2 as will be shown in Sections 
[HandlHl 

In the following, let 11^ {m G Z"*") be the projector defined in (jl]) of Section [21 
Under the choice of function spaces and basis systems introduced in Section [31 from 

Po = max(2M + ko, 2M + ko - k^) . 

Moreover, we introduce two inner products and their corresponding norms in V and 
HmV under the inequality (Q, 

{f:9)e^K ■= {nKf,IlK9)e2, (/, ^q^at := fi(nAr/, Htv^), (29) 

\\f\\%,K ■■= {fjli^K, \\f\\lN--={f,f)Q,N = mNf), 

where the statement ||/||^2 x= =^ / = is guaranteed, for / G or / G UmV, 
because of C2 and IQ. 

The algorithm consists mainly of three parts, where the first (Step 1 of Algo- 
rithm of [1]) is the calculation of the basis vectors of Ilpgl^ and the second (Step 2) 
is the calculation of the basis vectors of IIat^ and the third (Step 3) is the removal 
of the components which do not belong to IIn{V rii'^{7j'^)) from linear combinations 
of these basis vectors. 

For Step 1 and Step 2, the matrix elements (m, n G Z^; |m— < io, n < N) 
are required. As has been explained in Section [H these elements can be calculated 
from just the coefficients i3r^s ifh^h &'L]h — M<fh<h + M + kQ — k^), by a 
simple substitution of m and n into the power expansion (12^ . Hence the calculation 
of these coefficients suffices, and it should be carried out before Step 1; we regard 
it as a preliminary step Step 0. 

The step Step 3 consists of the three sub-steps Step 3.1.a-Step 3.1.C given 
below. With these remarks, the concrete procedures of the algorithm can be stated, 
as follows: 

Algorithm 

Step Calculation of coefficients Pr,s 

Calculate (3r^s [fh^ii eZ; n — M<m<h + M + ko — k^) by the procedure 
given in Table [H as is explained in Section [H 

Step 1 Calculation of basis vectors of UpoV: 



Practical integer-type algorithm for higher-order differential equations 14 

Find a basis system {-Pi^^j^Lo; • • • > {Fn^^}^=o of UpgV by Gaussian elimination, 
with the matrix elements 6^ calculated by (HM and the result of Step 0, where 
the dimension D is determined automatically by Gaussian elimination. This is 
easy because po is small. 

Step 2 Recursive calculation of the basis vectors of n„V" {po + 1 < n < N): 

Iterate the recursion ( l30l) below for n = po + l,Po + 2, . . . , A^, with the matrix 
elements calculated by and the result of Step 0, in order to obtain a 
basis system {Fi'^j^^g, . . . , {Fi^^}^=o of ^nV. 

Step 3 Removal of components from H^V corresponding to non-£^-ones in V: 

Step 3.1. a Integer-type quasi-orthogonalization of the basis system of IIatI^: 

Find a system of linear combinations {EiP}^^f), . . . , {Ei!^^}^^^ of 

{Fn^}^^Q, . . . , {Fn^^}^^Q which is sufficiently close to an orthogonal sys- 
tem with respect to the inner product (■, ■)q,n, by the procedures ex- 
plained below which is based on an intermediate idea between the Gram- 
Schmidt process and the Euclidean algorithm. 

Step 3.1.b Selection of minimum-ratio vector: 

Find {GW}to ■= {^i'°^^-^}n=o with rfopt. := argmini<,<^ crgU^W). 
Step 3.1.C Truncation (projection) by 11^: 

Project the result of Step 3.1.b to IIj^^. 

From Conditions C2, C5 and the definition ([1]), the calculations in Step 2 can be 
performed by the recursion 

^ n—l 

Pn^=-^ E ^U^i'^ (30) 
n-£o r=n-2£o 

with unchanged Fs'^^ (0 < s < n — 1), for n = po + 1; Po + 2, . . . , N. 

The results of Step 0-Step 2 belong to UnV- However, they contain components 
in IlNV\{UNiV n (= {Un/I f e V\{V n £2(Z+)}) which have nothing to 

do with the true solutions in C^'^(M) nL^^^(M) of the differential equation. Hence, we 
should remove the components in H7vV"\(H^(\/n£2(Z+))). Step 3 can almost remove 
them in the following sense, though we will prove it in detail later in this section. 
The orthogonalization with respect to (■, ■)q,n of the basis system of UnV provides 
us with vectors sufficiently close to Ilj\f(y ni'^{Z'^)) with respect to || ■ \\i2^k such that 
they belong to ((o"j^]v)~^[0' ^'^i?]v]) Section 2. For this, a 'quasi-orthogonalization' 
is sufficient, where the angles between any pair of vectors are sufficiently close to 7r/2, 
as will be shown later. Since exact orthogonalization (without round-off errors) by 
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the Gram-Schmidt process requires many calculations for large N and D, we will use 
the quasi-orthogonalization, without round-off errors but with fewer calculations. 

In the following, we explain in detail how the procedures in Step 3.1.a-Step 
3.1.C can be realized by integer- type programs. 

Since the vector F^'^'^ is rational- (complex-) valued, there is an integer Cd such 
CdF^'^^ is (complex-)integer-valued. Then, Step 3.1. a can be performed 
by the replacement procedures in Table [2] for ■yj'"^*"^^^ = p!^^^^ (^d = 1, . . . , D), with 



that F^^l 



a positive integer h. This is because of the inequality 



< — for 

M\q,n h 



^ j < D, which is guaranteed after the procedures in Table [2l whose halting 
will be proved in Section [HI 

The iteration of Ql in Table |5] looks somewhat like the 'lattice reduction prob- 
lem' [9] [To] (known to be an NP-hard problem), but this iteration is an 'imperfect' 
lattice reduction with few complex calculations which guarantees only the inequali- 
ties 



Re(^Vj,vi^ < -mm{\\vj\\'^,\\ve\\'^) and Im ^-iX,-, 



< - min( \\v 



for I < j < i < D, which are derived from the inequalities 



Re ( a, 6 



(6, a) 



a, a J c 



< jllSI 



and 



Im ( a, 6 



L a, a 



< jiisi 



This iteration is used only as a preliminary procedure for the iteration of Q2, where 
we are not aiming for the exactly minimal basis system of the lattice but instead, 
good enough orthogonality. Moreover, the iteration of Ql can be regarded as a 
combination of the Gram-Schmidt process and a multidimensional complex version 
of the Euclidean algorithm. The final results of these procedures give the vectors 



^ (final) 



(rf=l,2,...,D). 



For Step 3.1.b, with d^pt. 



argmin — ^ — '^'^ for the vectors vj:^^^^'' (d 

r ,11 -.(final) II a \ 

d&{l,...,D} \\v^ 'Wp^K 

(final) 



1, . . . D) after the replacement procedures in Table[2l we then define G^^^ :- 
^dopt. fQj, Step 3. I.e. Then, we have the following theorem: 



V 



Theorem 5.1 When h> D, the vector belongs to ((cr^jy) MO'Caj^jy]) 

whose proof will be given in Section El 

Hence, the following Theorem 15.21 shows the approach of Ilii-G^^^ (obtained in 
Step 3.1.c) to the vector space corresponding to the function space of true solutions 
in C^^(]R\ S) nL^^j(R) (of the differential equation) projected to the i^- dimensional 

subspace < eo, ci, . . . ,eK >, m the sense that converges 



to as tends to infinity: 



^2 
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Table 2: Basic idea of integer- valued quasi-orthogonalization 



16 



In the following, 



^ c := 



-sgn(Rez) [i — |Re2;|J — i sgn(Im2;) — |Im, 



Iterate Ql below until nothing is changed. 

-i sorting and renumbering of Vi, V2, . . . , Vd i^i order that 



Ql < 



\\vi\\q,N < \\v2\\q,N < ■ ■ ■< \\vd\ 

for J = 2 to D 

for £ = 1 to j — 1 









Vj ^ Vj - 







f This is a preparatory 'imperfect' lattice 
I reduction for the iteration of Q2 below. 



Iterate Q2 below until nothing is changed. 
' for J = 2 to 

for £ = 1 to j — 1 



Q2 { 



if h'^ ■ \ {Vj,V^)Q,N? > \\Vj\\l^N 



\ve\ 



[ ^3 ^ 2{;^- 



^ Vj - 



\-{ve,ve)Q,N^'C 
I This process provides us with the vectors \ 



^0 



'1<J<D) s.t. 
for l<j<i<D, 



\{Vj,Ve)Q,N\ 



< 



Theorem 5.2 For fixed K , when N goes to infinity, the convergence 

\\Pv,kx - x\\i2 K , . 

sup r-^ ■ > (31) 



Theorem 2.5 of [1] is equivalent to this, and its proof is given in Appendix of D of [1]. 

For the practical algorithm, the procedures Ql and Q2 can be executed with 
fewer calculations by replacements of the coefficients, instead of vectors, in the ex- 
pansion of the vectors in terms of the initial basis vectors, as shown in Table [3l More- 
over, we can propose a recursive renewal over n for the product ^If}- := Fn^^'F^f}^. 
(r = 0, 1, . . . , 2^0 + I; d = 1,2, . . . , D) instead of direct calculation of the inner prod- 
ucts, which reduces the order of the required amount of calculations of inner products 
between the vectors. 

The order the number of calculations required for obtaining the inner products, 
which is empirically the narrowest bottle neck of our method, is 0(A^^(log A^)^), 

because the number of digts of numerators and denominators of Fi*^"* (n G Z+) can 
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be bounded by O(nlogn) (and the denominators of larger n are always multiples of 
the denominators of smaller n). However, by these modifications, the order of the 
number of calculations required for obtaining the inner products, which is empirically 
the narrowest bottle neck of our method, is 0(A^^(log A^)^). This is not too large 
because the number of significant digits of the solution obtained by our method is 
not fixed, but is an increasing function of N there because of the accuracy increasing 
as N increases, as is expected from the discussion before Theorem 15.21 In practical 
use, we augment the dimension K as well as so that they may be proportional to 
each other, for high accuracy. Then, the number of significant digits of the solution 
obtained increases without limit, which is observed empirically in numerical experi- 
ments. (Empirically, in most cases, the number of significant digits increases almost 
in proportion to a power of A^.) A similar fact can be shown mathematically with the 

order of limits ' lim lim ' because — ^K)f\\i (^Qj^ygj^ggg for /* G V"n£^(Z^ 

K^oo N^oo " "" 



as K tends to infinity. Some of these numerical results will be presented in Section 

Huge integers can be treated by integer arrays for the base- 10^ positional notation 
^q(IO^)^ with inte gers Q which are smaller than 10^. We constructed practical 

e 

program modules for the four arithmetical operations on these integer array expres- 
sions of huge integers. 

Even when Di2 := dim\^ fl £^(Z+) > 2, some modifications enable us to obtain a 
quasi-orthogonal vector system . . . , G^^t'^^ in ((o"/<^]v)~^[0, ccrj^jy]) with re- 

spect to the inner product (■, ■)f2^^. and their projections I\.kG'^^\ HxG*^^\ . . . , H^G^^^^) 
to Hi^V^. The details will be given in Section [71 

6 Suboptimality of the vectors obtained by Step 
3.1. a and Step S.l.b 

In this section, we will prove Theorem 15. Ij which guarantees that the vectors G^^^ 
obtained by Step 3.1. a and StepS.l.b belong to ((o"^]y)~"'^[0' '^'^x ]v]) with finite 
fixed c. For this, we begin with a lemma. 

Lemma 6.1 Let U be a finite dimensional space and (■, ■)a and (■, ■)= be inner prod- 
ucts there. If nonzero vectors /i, /2, ■ ■ ■ fn (n < dim U) in U satisfy — 1^'^"^' '^^^^^ — < 

1 1 fm 1 1 H ■ II fm 1 1 S 

( {1 < m < i < n) with fixed ( such that < ( < , then the inequality 

n — 1 



Practical integer-type algorithm for higher-order differential equations 

Table 3: Practical operations for integer-type quasi-orthogonalization 



for J = 1 to Z? 

for m = 1 to D 



Vjm ^ \Vj , Vm )q,N 

/ -^(initial) -»(initial)\ 

I Cjm '■ complex-integer-velued coefficients 

D 



V 



for = V c,>„tii'""-') 



J 



Iterate PI below until nothing is changed. 

• with a permutation ni,n2, . ■ . ,nD of 

1,2,...,D S.t. Pll < P22 < ■ ■ ■ < Pdd, 

for j = 1 to D 

• for j = 2 to D 

for £ = 1 to j — 1 



PI < 



Pit 



r <— 

' Pu^ 
for m = 1 to D 

j_ ^jm ^ ^jm ^ 7 Pjm ^ Pjm ^ Ptm 7 Q.jm ^ Q.jm Qin 

for m = 1 to D 

I Pmj ^ Pmj ^ PmZ ? j Q.mj ^ 



Iterate P2 below until nothing is changed. 
' for j = 2 to D 
for ^ = 1 to j - 1 



P2 < 



if 



Pit 



Pit 



> 



Pa 



Pu 



for m = 1 to D 

Pmi ^ 2p^j , (Imj ^ 2cifyij 

Pjm ^ '^Pjmj Qjm ^ "^Qjm 

'Pit' 



^PU 



r -ir- 

for m = 1 to D 

L ^jm ^ ^jm ^ ^tm •> Pjm ^ Pjm ^ Ptm ; ^jm 

for m = 1 to 

I Pmj ^ Pmj -rpmt, qmj ^ Qmj - r Qme. 



Qjm - r qtm 



for j = 1 to 

D 

^ (final) 



E ,7 (initial) 



m=l 

for ^ = 1 to D 



if j > 1 or m > 1, 

^ (final) -.(final) \ , \^ \^ 

^'"l }Q,N ^ 2^ 2^ CjmCe.n Prr 



D D 



m = l n=l 
D D 



/^(final) ^(final)\ , 

{Vj ,Vl )P,K^2^2^ CjmCtn q„ 



m=l n=l 
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n n 

Proof of Lemma 16.11 For / := ctm/m, define A := kmni/m|||- With the 



definitions Pm '■ 



fn 



Cn 



ll/mlU 



m=l 



and hjn ■= am ||/m||H, we have 



Wfmh \\U 

n n 

< C, \\Pm\\A = Cm, / = X] bmPm and A = Then, 



m=l 



m=l 



Tt 71 Ti Th TL 



m=l 

On the other hand, 



n 



m=l 



n n 



m=l 



m=l 



m=l 



^ bmPm\\^ = ^^bmbe{Pm,Peh > -(X^ l^ml) C + |&mP(l + C) 



-'mPm 3 
m=l m=l £=1 

n n 



m=l 



m=l 



> -iY, M {J2^)C + M^ + = -AnC + A{1 + = A{1 - (n - 1) C). 



m=l 



m=l 



Hence 



< 



m=l 



II/" 



m\\A 



1 ll/mlls 



I" A{l-{n-l)C) l-(^-l)C 



This lemma enables us to prove the following theorem: 

Theorem 6.1 Let U be a finite dimensional space and (■, ■)a and (-, ■)= be inner 



products there. If nonzero vectors fi, /2, ... /„ {n < dimf/) in U satisfy 



|(/m, /^)5| 
1 1 fm 1 1 H ■ II fm 1 1 H 



(1 < m < £ < n) wt/i /ixec? C suc/i i/ia^ < C < 



n — 1' 



then 



mm 



< 



inf 



\u\ 



j \\Vj\\A y 1 — (n — 1)C «ec/\{o} ||m||a 
Proof of Theorem 16.11 From Lemma 16.11 



sup < 

ueu\{o} Imls 



E 



11/ 



mil A 



m=l II /"^ II I . ' II /ill I . / ^ 

^ i-(n-i)c - \ i-(n-i)c - V i-(n-i)c ""1 



n max 



jllA 



max 



ill A 



J 11 = 



This implies that 



. „ ||u||h ^ /l-(n-l)C . ujjwr- 

mi > \ mm 

u&u\{o} Ma V n j 



ill A 
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By means of Theorem 16.11 we can now prove Theorem 15.11 
Proof of rfeeorem 15.11 After the process Step 3.1. a, the vectors vl^^^^\ . . . vj^"^^^^ 

satisfy the conditions in Theorem 16. Ij with 11 ■ 11= = 11 ■ \\n n and ( = —. Hence, with 

' 1 

U = UnV, II ■ ||a = II ■ 11^2^;^, II • ||a = II ■ IIq.at, n = D and C = ^> when h>D 



II ^11 I r-i ll^(final)|| 

\\g\\e2,K , / D \\v^ '\\e2K 
sup ' < J n— r uiax — 77 — n ■ 

senJxm \\9\\q,n - V '='-^ \\v!t^^^\\Q,N 

From the definition of G^^^ (made just before the theorem), this imphes that 



aP^^GW) = lS^< min < a/^^ i^f Hr^ ■ 

||G(i)||,2,^ " d=i,...,D ||{;f '^^')||,2,^ " V ^ " ^ 9en^v\{0} Wgy^K 

This inequahty shows that the vector G^-^'^ belongs to ((crj^]v)~^[0, ccr^jy]) with 

D - '~ 

c = , from the definition of G in the procedures for Step 3.2. ■ 

Thus we have proved the vahdity of our method, by means of Theorem 15.11 and 
Theorem 15.21 

Remark 6.1 For a practical algorithm, we propose an improvement where 
the iteration of P2 in Table |3] is made also in the halfway steps in the 
recursion Step 2. This improvement can reduce the size of the integers, 
and does not influence the structure of the algorithm because the iteration 
of P2 results only in a change of basis system. 



7 Extension to the cases with dim V H £^(Z+) > 2 

In this section, we explain how to extend the proposed method to the cases with 
dim n > 2, i.e. how to obtain approximately a quasi-orthogonal basis 

system of the subspace V fl £^(Z+) with respect to (■, ■)i2^K- In Algorithm of our 
preceding paper [1], for simplicity, we explained Step 3.n {n > 2) using exact 
orthogonalization with respect to (■,-)^2^. However, the exact orthogonalization 
requires a large amount of calculations and hence it is not practical. Moreover, the 
proposed method does not necessarily require the exact orthogonalization but quasi- 
orthogonalization is sufficient as is proved later. Therefore, in this paper, for Step 
3.n, we propose a practical procedures based on an quasi-orthogonalization which 
requires a relatively small amount of calculations. This is a slight difference between 
the algorithms presented in [1] and this paper. For this difference, here we use step 
numbers Step 3'. A and Step 3'. A.. . . 

When dim V"n£^(Z+) = 2, the extension is possible, based on the idea of the quasi- 

minimization of the ratio ^^'^^^'^'^ in the subspace almost orthogonal to G'-^^ Let 

Wfh^K 
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G be the result of this quasi-minimization. Similar extension is possible even when 
dim V^n£^(Z+) > 3, based on the idea of recursive iteration of the quasi-minimization 
of this ratio in the subspace almost orthogonal to the span of the already obtained 
vectors G^^\G^'^\ . . . . Thus we can obtain as many linearly independent vectors as 
dimK n each of which belongs to ((crf]v)"MO, ccrj^jy]) ^iid hence is close to 

V n £^(Z^). This provides us with an approximate quasi-orthogonal basis system 
of \^ n £^(Z"*") with respect to (■, ■)e^,K, i-e., with an approximation of the 'general 
solution' of the differential equation. 

Here, we should be careful of the fact that a linear combination of a set of vectors 
is not always close to \^ fl even if all the vectors in this set are close to 

V n £^(Z+) (in the sense of the angles between the vectors and the space). However, 
as is shown later in Section [HI when the set of vectors form a quasi-orthogonal system 
with respect to (■, ■)p,Ki any nonzero linear combination of them is close to V"n£^(Z+). 
Hence, the idea mentioned above for the extension does not suffer from this problem. 

In this section and the next one, we will explain the details of the extension. 
The procedures Step 0-Step 2 do not change at all with this extension, because a 
basis system of H^rV^ is required there in the same sense as the one-dimensional case. 
Hence, we have only to explain how to modify Step 3 for this extension. In this 
section, we explain only how to modify the procedures; their validity will be proved 
in Section [HI 

For a concrete description of this, we provide some preliminary notation. Let d 
be the number of 'already obtained vectors' G^'^^ {d = 1,2, . . . by the method 
mentioned above, and let :=< G^'^\ . . . ,G^'^^ > where we set Tq := {0}. Let 
i?j denote the subspace of Htv^ satisfying UnV = Tj©i?j, in which we find another 
quasi-minimum-ratio vector at the next step. By the procedures in Step S'.A.l 
below, this subspace is chosen to be very close to the orthogonal complement 
IlNV-^'^d of Tj with respect to (•, ■)e2^K, but it is not always exactly equal to the 
latter, where 'closeness' is used in the sense that the angles between nonzero vectors 
in and nonzero vectors in i?j are close to — with respect to (■, ■)e2^j^. Obviously, 

dimTj = d and dimi?j = N - d. Moreover, let D(,2 = dimm f{Z+). 

With these notations, for the extension to the cases where Dj > 2, the procedures 
in Step3 are replaced by 

Step 3' Removal of components from UnV corresponding to non-£^-ones in HjvV^: 

Step 3'. A Integer-type extraction of a quasi-orthogonal basis system for UisfiVn 
£2(Z+)): 

Iterate the series of steps Step 3'. A. al— Step 3'. A.b2 below (once in this 
order) for d = 0, 1, , . . . , Df2 — 1 with the initial subspaces Tq = {0}. (If 
we require an upper bound for errors, iterate this for (i = 0, 1, , . . . , Di2.) 
The result of each step of this iteration gives a quasi-orthogonal basis 
system G^-^\ G^'^\ . . . G^'^\ of Tj with respect to (■, ■)(2^j^. (This substep 
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Table 4: Basic idea of integer- valued quasi-orthogonalization (general) 



if d > 

• Iterate SI below until nothing is changed. 



' • sorting and renumbering of v^^i, ■ ■ ■ ^ 



SI < 



in order that \\v 



d+l 



\p-,K < lFd+2lk^,^ < • • • < II^^dII^^^;,; 



• for j = d + 2 to D 

for ^ = 1 to j - 1 



Vi <— 



This is a preparatory partial lattice reduction for Q2 below. 



• Iterate S2 below until nothing is changed. 
' for j = J + 2 to £» 
for £ = 1 to j - 1 



S2 



if g'^\{vj,ve)e2^K\'^ > WvjW^^j^WviW^^i^, 



/ This process provides us with the vectors Vj (1 < i < -D) \ 

\{vj,ve)i^K\ ^ 1 

\\V:i\\f2,K ■ \\vt\\e--.K ~ (] 
\ lor (/ + 1 < J < (. < D or 1 < .y < (/ + 1 < / < D. 



J 



Iterate Q'l below until nothing is changed. 

' • sorting and renumbering of • ■ ■ ' 

in order that Wv^+^Wq^n < \\vi+2\\Q,N < ■ ■ ■< H^^Dllg.iv 
• for j = d + 2 to L> 

for ^ = J + 1 to j - 1 

{Vj,V£)Q,N' 



Q'l < 



{V£, Vi)Q,N 



( This is a preparatory partial lattice reduction for P2 below. 



Iterate Q'2 below until nothing is changed. 
' for j = J + 2 to £> 

for ^ = J + 1 to j - 1 

if h^\{Vj,Ve)Q,N\''> \\Vj\\Q,N 

{vj,ve)Q,N 



Q'2 < 



.AT) 



{ve,ve)Q,N- 

( This process provides us with the vectors (1 < i < -D) 

l(^i,^^.)Q.iv| _ < 1 for d+ 1 < j < £ < and 
N h 



s.t. 



Pj\\ei,K 



1 

< - 



D-d 



D-d-\ 



for 1 < j < d + 1 < £ < £>. 
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Table 5: Practical operations for integer-type quasi-orthogonalization (general) 



for j = 1 to D 
for m = 1 to D 

Pjm <— \Vj , Vm )q,N 

/-'(initial) ^(initial)\ 



I Cjm '■ complex-integer-valued coefficients \ 

D 



for V. 



(new) 



E 



-.(initial) 



m— 1 

V Pjrrn Ijm ■ inner products 



if d > 

• Iterate Rl below until nothing is changed. 

• with a permutation ^ij+i, ■ ■ ■ ,nD of 
d + l,d+2,...,D s.t. Qi+ia+i < 9d+2<i+2 < • ■ ■ < Qdo, 
for j = 1 to D 

\_Cjm C^jm: Pjm PrijUmi Qjm PrijUm 



Rl < 



• for j = d+2to D 
for £ = 1 to J — 1 



r <— 
for m = 1 to D 

_Pjm ^ Pjin ^ P£mj Qjm 

for rn = I to D 



Pn 



Pnij "r Pm^j Qmj 



- Ijm - r qtm 
qmj -fqme 



• Iterate R2 below until nothing is changed 
for J =d + 2 to D 
for £ = 1 to j — 1 



R2 < 



if g' 



c 



(lit 



> 



In 



+ 1 



gee 

NK. 



for m = I to D 

Pmj ^ ^Prn j i 
Pjm ^ 2pj-fjij 

-Qjl] 



Qmj ^Qmj 



r 

for m 



= 1 to £> 

^ Pjm - rpi„ 
^ qjm - r qtm 



Pjn 
qjrr 

for TO = 1 to D 

Pmj ^ Pmj ^ Pmi 
_ qmj qmj ^ qm£ 



(to be continued to the next page) 
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(continued from the previous page) 



Iterate P'l below until nothing is changed. 

• with a permutation nj^j, nj_|_2, . . . , nu of 
d+l,d + 2,...,D s.t. pj+i j+i < Pd+2d+2 < 
for j = 1 to D 

\_^jm ^ ^n-jirif Pjm ^ Pn-jUmt Qjm 

i for j = d+2to D 

for f. = d+ltoj-l 



P'l < 



PfT/j Tim 



r <— 

for m = 1 to Z? 



Pnn ^ Pr 



Ptm ) Qjm 



< Pdd, 



for m = 1 to Z? 

I Pmj ^ Pmj ^ Pm^ ? Qmj 



Qmj — T 1w£ 



Iterate P'2 below until nothing is changed, 
for J = rf + 2 to D 
for £ = J + 1 to J - 1 



P'2 < 



if e 



Pje 



> 



Pjj 



for m = I to D 

Pmj ^ '^Pmj ; Qmj ^ ^Qmj 

Pjm ^ ^Pjmi Qjm ^ ^Qjm 

-Pjt] 



r <— 
for m : 

Cjm 
Pjm 

for m = 

- \_Pmj 



Pel 



1 to D 

- Pjm ^ Piim Qjm 

1 to D 

~ Pmj Pm£j Qmj 



■ Qjm - r qtm 
Qmj -rqme 



tor j = d-\-lto D 



^ (final) 



/ , ''im'-'m 



(initial) 



for £ = 1 to L> 

iij>d+loYm>d+l, 



D D 

? (final) (final) X , \^ \^ 

m—1 n— 1 
D D 

(final) -♦(final) 



CjYn^&n Pmn 



for J = 1 to D 
for d = 1 to D 

D 

(final) 



'j,d 



(initial) 



m=l n=l 

( Qj^d, '■ complex-integer-valued \ 

D 

coefficients s.t. vj = aj,dF^ 



int. 



m=l 



d=l 



J 
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is an extension of Step 3.1.a+Step S.l.b. Each iteration with d = n of 
this substep is corresponding to Step 3.n of [4J, with a shght difference 
between exact orthogonahzation and quasi-orthogonahzation with respect 

Step 3'. A.al Integer-type quasi-orthogonahzation of the basis system of 

If 0? > 1, find a system of N—d linear combinations Vi"^^, ■w^'^'*, . . . , v'^'^'^^ 

of F^^J , F;^^-* , . . . , F^^^ which is sufficiently close to an orthogonal sys- 
tem with respect to the inner product (■, ■)e2^x and also almost or- 
thogonal to Tj with respect to (■, ■)i2^K, by the procedures explained 
below which are based on an idea intermediate between the Gram- 
Schmidt process and the Euclidean algorithm. Then, if d > 1, choose 
the subspace i?j to be the span of the N — d linear combinations 



^1 ' ^2 , • • • , 



^f^r obtained by this substep. This substep is omit- 



ted by the simple substitution v^'^^ := F^^^^ {d = 1,2, . . . , D) and 
Rq := IIatV^ when d = 0. 
Step 3'.A.a2 Integer-type quasi-orthogonalization of basis system of 

Find a system of N — d linear combinations uf'^^, u^*^^, . . . , ^^'^'^ 

of the N — d basis vectors "'^j ■y^'^^, . . . ,v^'^Sd R-d obtained in 
Step 3'. A.al which is sufficiently close to an orthogonal system with 
respect to the inner product (■, ■)q,n, by the procedures explained 
below which is based on an intermediate idea between the Gram- 
Schmidt process and the Euclidean algorithm. 
Step 3'.A.bl Selection of minimum-ratio vector: 

Find the linear combination G^'^~^^'> with minimum ratio | — li^r!^ in 

the N — d hnear combinations uf'^^, u'^'^^ , ■ ■ ■ , u^^i^ obtained by Step 
3'.A.a2. 

Step 3'.A.b2 Innovation of space Tj: 

Let Tj+i := © {aG^^+^^ | a G C} with G^^+^^ obtained by Step 
3'.A.bl. 



Step 3'.B.c Truncation (projection) to n^V^: 

Project the vectors G '^'^^ (ci = 1, 2, . . . , Dp) obtained by Step 3'. A by 11^^. 

In the following, we explain how to realize these procedures, in detail. These 
procedures can be described in a unified framework of an iterative change of the 
basis systems of liNV- In this framework, all the basis vectors of n^vl^ at the 
intermediate steps with d = 0,1,..., Di2 in Step 3'. A above are denoted by V^"^^ 
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{d = 1,2, . . . , D). With these notations, the basis vectors of IItvV^ just after Step 
S'.A.al with d are 



while they are changed to 



<d> 



y<.> 



G^^^ {iid<d) 



v^^^ {ifd>d + l) 



G^'^^ {if d<d) 



u^l} {iid>d+l) 



by Step 3'.A.a2 with d. Hence, Tj =< V^^> , V^^> , vf> > while 

i?j =< Vf'^r',Vf'^^,...,V^'^^ >. In each substep, they are expressed as linear 



D 



combinations V<'^> = ^ a<^> Fj^J^ of the initial basis vectors F.}^\ F^f^ 

d=0 

(obtained by Step 2) with the coefficients cb^f' {j^d = 1,2, . . . , D; d = 0,1, . . . , Dp). 

At the initial step of Step 3', with d = 0, let V<^> := F-^^ (d = 1,2, . . . , D). 
From this initial basis system of UnV, perform the following concrete procedures: 

(*) For Step S'.A.al, with Vd = V^^^^ {d = 1,2, . . . , D), do the iteration of Rl 
and then the iteration of R2 in Table H] with a sufficiently large integer g. (How to 
choose g will be explained later.) These procedures are omitted for the exceptional 
case of d = 0. In the result of these iterations, the vectors Vj {d+1 < j < D) satisfy 
\{vj,ve)i2^K\ < i for J+ 1 < ^- < £ < or 1 < < J+ 1 < £ < J > 1^ 



Next, for Step 3'. A. 2, with Vd {d = 1,2, . . . , D) renewed above, do the iteration 
of P'l and then the iteration of P'2 in Table HI 

Next, for Step 3'. A. 3, with d^^{^ := argmin || '^^j^''^ for the vectors Vd 

d€{d+l,...,D} \\^d\\e'2,K 

{d = d+ 1, . . . D) after these procedures, define := t?,<d> I = u"^^^-, _ ) . In the 

opt- \ '^opt. -d- J 

unified framework of change of the basis system mentioned above, this is equivalent 
to V^^^^^ := v^^iy. This is an implicit process for Step 3'. A. 4. Here note that 

the basis vector V^^^^^ is fixed exactly then and it remains fixed. At any steps in 
the procedures in Table HI the vectors V^'^^ with d < d are not changed. Moreover, 
define V^/'^^^^ := Vd^^> ford =1,2,... d. 

With the above innovation of the basis system, the basis vectors V^^'^^^^ 

(1 < j < D) satisfy -^-J ' ^ ^ - — < - for + 1 < j < ^ < D and 

W^j \\q,n ■ \\Vi \\q,n 
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they satisfy 



<d+l> y<d+l> 



K\ 



|T/<d+l>|| ^ 



IK 



<d+l> I 



< 



9 



D-d 



D-d-1 
9 



for 



l<j<d + l<£<D when g > D — d, where the latter inequahty will be proved 
in Section |8l 

Since the vectors V^'^^^^ with d = 1, 2, ... d have been fixed to be G from the 
discussions above, the following quasi-orthogonalities (a)-(e) are guaranteed simul- 
taneously when d > 1 and 9 > D — d. 

Lemma 7.1 For the vector systems defined above with d> 1, the inequalities below 
are satisfied: 



[a): The basis system ofT^ satisfies 



1 

— < - 

,K 9 



\ 



D-d 



D-d-1 



for 1 < J < i < d. 



9 



(b): The old basis system of satisfies 



l<j<i<D-d. 



\Vj ^^e IP, 



K\ 



\vf>\\,.^K-\\vf'^\V^K 9 



^ 1 f 

< — for 



[c): The new basis system of satisfies 



N\ 



\^j \\q,n ■ \\u^ Wq^n 



< - for 



l<j<£<D-d. 

(d) : Between the basis system of and the old basis system of R^, the inequality 



K\ 



I GO-) I 



,K ■ \\Vi 



<d> I 



— <- holds for 1 < j < d and 1 < i < N - d. 
J< 9 



Between the basis system ofT^ and the new basis system of R^, the inequality 

holds for I < j < d and 



K\ 



\G(^)\U,K-\\uf''>\k2K 



l<i<N-d. 



1 

< - 
9 



\ 



D-d 



1 - 



D-d-1 
9 



The proof of (a) and (e) is given in the first part of Section [HI whereas (b)-(d) is 
obvious by Table HI Here, note that the quasi-orthogonality (c) is with respect to 
-)q,n while the others are with respect to (■, ■)^2^^. With h > D — d, the quasi- 



orthogonality (c) guarantees that = u'^^f^ - belongs to {{ctk^n) ^['^^'^'^k,n]) 



with c 



D-d 



D-d-1 



from a similar discussion to Section El with D-d instead of 
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D. Hence, the convergence to nx(y n£^(Z"'")) is similarly guaranteed by Theorem 
15.21 The quasi-orthogonality (a) is essential later in order to prove that any nonzero 
vector in T^^^ (i.e., any nonzero linear combination of G^^\G^'^\ . . . G^^^'^^) belongs 
to ((cTii^jv) ""^[0' '^'^XTv]) with a fixed finite c, as was mentioned briefiy at the beginning 
of this section. 

Next, as a preparation for the next step, we choose the basis vectors V^^l^^^, ^/t^^^; • • • ; V^'^'^^^ 

so that V^'^^^'^ , \?2^'^+^>, . . . , V^'^+^^ may be a basis system for UnV . For this, linear 
independence suffices, because dimllArl^ = D. An easy method is the choice of the 
vectors other than u^^^^ - in uf'^^ ,uf'^^, . . . u^^~ (i.e., the vectors other than ,<d> 

"opt. " ^P^- 

int wj+i, ■■■Vd after Step 3'.A.3 ) for the vectors V^_^^^> , V^^^^^ , V^^^^> ■ 
However, this method requires more calculations than the choice from the initial 
basis vectors F^^^ , F^^^ , . . . , F^l^\ because the quasi-orthogonalization procedures 
for Step 3'. 2 with respect to (■, ■)q,n have made the basis vectors almost parallel 

to one another with respect to (■, ■)e2^x- Hence, a choice from -Fj^J^ ^int.^ ■ ■ ■ j ^h^.^ 
is desirable. The check of linear independence can be made then with only the 
coefficients ij,d = 1,2, as follows; with the definitions of vectors 

ttd by {dd)j ■= dj^d and the set 5* of the D — d — 1 numbers of the chosen vec- 
tors from Fj^^-*, . . . ,F^^ such that the statement n G S' is equivalent to the 
statement that the vector F-j^ is chosen as one of the basis vectors, where the lin- 
ear independence is guaranteed if the vector system {a^ \ d G {1,2, .. . ,D}\S^} is 
linearly independent. Since there exists at least one choice of S such that this sys- 
tem is linearly independent, which is easily shown from the linear independence of 
V<d+i>^ y<J+i>^ _ _ _ ^ V<^+^>^ this type of choice always exists. 

In other words, there exist integers n^j^^i ^d+s: ■ ■ ■ such that the vector system 

^<J+l>^ y<d+l>^ ^ ^<d+l>^ ^K>3)^ ^ pi^^n) ^ b^gig gyg^g^ j^^y 

The set of these integers should satisfy only the following conditions (i) and (ii): 

(i): The vector system \^dd \ d E {1,2,..., D}\{n^j^^, ^d+i^ • • • ' "^d}} is linearly in- 
dependent. 

(n): Um ^ li m ^ i. 

By the check of coefficients a^^^^^, these integers can be easily chosen. Empirically, 

except for very special cases with simple symmetry, most choices oiD — d— \ distinct 
numbers from {1,2, give the linear independence of (i). We have only to 

check the linear independence of (i) with an arbitrary choice of {n^j^^i ■ ■ ■ i'^d^ 

satisfying (ii), by the coefficients a^^^^"^. If, exceptionally, linear independence is 
not satisfied, replace one of these numbers by another, and try it again. How to 
determine the coefficients a^^'*'^^ easily will be explained later, in the explanation of 
practical operations with a reduced amount of calculations. 
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With these integers, define Vf^^^^ := F-^^^'^ for d = d + 2, d + 3, . . . , D. Then, 
with the increment d — )■ d + 1, return to the top (*) of the procedures for Step 
3'.A.al if J< Di2. 

By means of the iterations explained above, we can obtain a quasi-orthogonal 
vector system (with respect to (•, •)£2^^) which satisfies the following theorem: 

Theorem 7.1 When h > Dp and 

9^2 (^\/ i^P ^ + ^Dp^Dp — 1) + {Dp — any nonzero linear combina- 
tion of G^^\G^'^\ . . .G^^i^^ obtained by the above iterations (Step 3') belongs to 
((4%)-'[0,ca|]v]) ^^th 



Dp 

c{g,h,Dp) := - 



Dp 



1 _ R£ 



Dp — 1 / Dp 



9 

V 9 

The proof is given in the last part of Section |8l This theorem implies that the 
vector system II^^G*^^), 11^^(7*^^^ . . . n^^G'^'^^^) jg approximately a quasi-orthogonal 
basis system for IlKiY H £^(Z+)), because of Theorem 15.21 

As was mentioned in Section |5] for one-dimensional cases, the number of calcula- 
tions can be remarkably reduced by some modifications. With these modifications, 
we propose a practical realization in Table |5] instead of the procedures in Table |H 
In this practical realization, the innovation of the coefficients a^^'*'^^ is made at the 
last step. 



8 Suboptimality under the extension to multidi- 
mensional cases 

In this section, proving Lemma 17.11 and Theorem I7.H we show the validity of the 
procedures in extension Step 3' which has been proposed in Section [7] as an extension 
of Step 3 to the cases where dimV^ n £2(Z+) > 2. 

First, as a preliminary process, we will show the quasi-orthogonality 

\/-r}<d+i> t7<J+i>\ I ^ 



< - 

|T7<d+i>|| ^ iiT7<'i+i>ii a 

I \\P,K II \\P,K ^ 



\ 



^^-^ il< J <d+l<l< D) 

l_P-d-l - - J 



9 

with respect to (■, ■)p^k in (a) and (e) of Lemma [7?n Since the vectors V^'^'^^^^ with 
d + 1 < £ < D are linear combinations of the vectors iTf v^'^^ , . . . , v'^'^^- (i.e., of the 
vectors v;ij^^, ■ ■ - Vd after the iterations of SI and S2 in TableH]) where the quasi- 

orthogonahty — ^-1^ — '--^ — — <-(l<j<i<D — d)is guaranteed, the 

|Lt<<i>|| „ IUt<'^>ll „ n 

\\i^,K ■ Wp^k 9 
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above quasi-orthogonality for V^'^^^^ , i.e., (a) and (e) of Lemma [7. II can be shown 



3 



directly from the following lemma, with n = D — d, Vm = u"^^ {m = 1,2, . . . , D — d) 

and C = e = - : 
9 

Lemma 8.1 When any pair of two distinct vectors in the set {vi, V2, ■ ■ ■ Vn} satisfies 

j^"^,' < C (with fixed ( such that < C < ) (^nd there is a vector u 

\\vm\\ ■ \\ve\\ n-l ' 

such that the inequality ^ < e holds for m = 1,2, ... ,n, then the inequality 

\\Vm\\ ■ \\U\\ 

' ^ ^ \ 7 TT holds for any vector w G< Vi,V2, . . . w„ > \{0} . 



||w|| ■ IImII y 1 — {n — 1) C, 

y ■ ^ 

Proof of Lemma I8.lt Define Pj := ,, J ,, (?' = l,2,...,n) and q := -rr-^. Then, 

\\u\\ ^ 

from the condition of the lemma, for any pair of distinct vectors in the set {pj \ j = 
1,2, ...,n} , the inequality \{pm, Pi)] < C holds. Moreover, the inequality \{pm, q)\ < 
e holds for m = 1, 2, . . . , n. Since w G< vi, V2, . . .Vn > \{0}, there is a set of complex 

n n 

coefficients bj (j = 1,2, ..,n) such that w = beflj and A := jfo^p > 0. From 

e=i 1=1 
the Schwarz inequality and the condition of this lemma, the following two inequalities 

can be derived: 

n n n 

i=i e=i E=i 



£=1 e'=i 1=1 1=1 

n n 

> -(El^^l')(El)C + (l + 0^ = (1 - in-l)C)A 

1=1 £=1 

where we have utilized the relation {j)i,pi>) = 1 = (1 + C) — C • Since A> Q, 

■ 

In the following, by means of this quasi-orthogonality and Lemma 16. H we prove 
Theorem 17. H i.e., the convergence of any nonzero linear combination of the vectors 
IiKG^^\liKG^'^\ . ..HkG^^^^^ to n^(V^n£2(Z+)), under the choice of the integers h 

and g such that h > Dp and 9 > ij^ (y\/{De,2 - 1)2 -\-ADii2{Dp - 1) + {Dp - 1) j . 
Proof of Theorem U .1\ From the conditions for h and g, the inequalities 
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1 ^ > 0, 1 > and 1 , / ^ — - > hold. Let W 

h ' g g \l 1-^^ 

be the span of the vectors G^^\G^'^\ . . . G''^e^\ This is a subspace of IItvV^- As has 
been explained in Section [TJ the vector G'^'^^ belongs to {{o'^^^)~^[0,ca^^^^]) with 



1 



^ -^"Y, and hence it belongs to ((o"j^]v) ^[0) d 2-1 ' "^^i^lv]) because 



h 

^ < n -1 • Moreover, as has been shown in Lemma 17.11 with the 



^ h h 

above proof, these vectors satisfy the quasi-orthogonality 



\GmpK-\\Gmp^K ~9 



Df2 — d 1 D02 



^_ Dp^- d-l g\l 1 



9 

Hence, from Lemma WT\ with (-, ■)a = (■, ■)^2 (■, ■)= = (■, ■)q n and 
;^i,pf7. we can show the inequalities 



||/=y(m)||2 I pr~ 



£'-,2-1 



for any vector w in < G'^^\ G^'^\ . . . , G^^e^^ > \{0}. This implies that any nonzero 
linear combination of G^^\G^^\ . . . G^^^^) belongs to ((ag]v)~MO, c{g, h, ^2)^^^^]^]) 
with 



De2 

c{g,h,De2) := - 



D,2 



1 _ ^^2-1 



^ Z)^2 — 1 j Dp 



1 _ ^.2-1 



This fact guarantees the convergence of any nonzero linear combination of the vectors 
nxGW,nxG(2),...ni^G(^f2) ^ ni^(y n £2(Z+)) in the same sense as Section El 
under the choice of integers h and g such that h > Dp and 
9>\ [^/{Dp~iy + ADp{Dp-l) + {Dp - 1)). 
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9 Proof of the halting of Step 3.1 

Since the procedures in Table [2] and Table [3] for the process Step 3.1 contain iter- 
ations which finish only under certain conditions, we should verify that they halt. 
Otherwise, the algorithm could not be executed. In this section, we prove that they 
halt in a finite number of steps. 

The main idea used in the proof of halting is based on the finiteness of the number 
of vectors in a lattice with bounded norms and the monotonic decrease of the norm, 
except for finitely many times, in the execution of Vj 2vj in Q2 in Table HI 

It is easily shown that the vectors vi,vi, . . . ,vd satisfy 

1 1 

|Re {vm, ve)u\ < - \\ve\\l , |Im {v^, ve)u\ < - IWiWl (32) 

and hence \{vm, ve)u\^ < ^\\ve\\i , if m^i. 

after the iteration of Ql of Table |21 

This inequality yields the following theorem: 

Theorem 9.1 The procedures in TablelB halt within a finite number of steps. 

For the proof of this, we begin with some preliminary definitions and a lemma: 

Definition 9.1 For a set of n vectors Ui, U2, ■ ■ ■ , Un, define 

n 

Latt(Mi, 'U2, . . . , Un) '■= ^i'^i I G Z + Zi (m = 1, . . . n)} . 

e=i 

Obviously, Latt(ui,'U2, . . . ,u„) C< Ui,U2, . . . ,Un> 

Definition 9.2 For a set of n vectors Ui,U2, . . . ,Un in a linear space U with norm 

D 

II ■ lU, define sh(mi, . . . , Un) '■= ll^jlU o^nd 

fi'\ui,...,Un) := { (/i,...,/z)) G (Latt(Mi, . . . s^ifi, ■ ■ ■ Jn) < s} . 

Lemma 9.1 Let U be a linear space and (-, ■) be an inner product there. Then, for 
a set of n vectors Ui,U2, ■ ■ ■ ,Un in U , with norm \\f\\ := \J {f, f), 

-* In 
sup inf II/ — .^11 ^ \/ 77 i^^^ ll'^rll • 

/£<«!,...,«„> V 2 re{l,...n} 
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Proof of Lemma l9.ll This lemma will be proved by mathematical induction. When 
n = 1, the statement in the lemma holds because 



■h-Hy<h 



Let P<ni,. ..,«,,> be the orthogonal projector to < ui, . . . , -u^ > with respect to (■,■). If 
the statement in the lemma holds with n = n', then 

"^f e<ui,...,Un' >, -^<^x<^ and ~ ^ <^ y < ^, 
sup inf 11/ + (x + iy)un'+i - g\\ 

2 



< W— max IIm^II + \\P<uu...,ur>{x + iy)un'+i\\^ 

Z re{l, .■■,"'} 

<^ max \\ur\\^ + ^ ||Mn'+iP < ^ ^ max 

2 re{l,...,n'} 2 2 r6{l,...,n'+l} 

This implies that the statement in the lemma holds also for n = n' + 1, because any 
vector h in < ui, . . . ,Un'+i > can be decomposed as h = f + {x + iy)un'+i + zUn'+i 
with / G< Ui, . . . ,Un' >, — I < a; < |, — | < y < | and z G Z + Zi. ■ 

Proof of Theorem 19.11 Let Vi,V2, ■ ■ ■ ,V d be the vectors Vi,V2, ■ ■ ■ ,vd after the 
iteration of Ql and before the iteration of Q2 in Table |2l and let Vi, V2, . . . , Vd be 
the vectors Vi,V2, ■ ■ ■ ,V£) after all the procedures in Tabled Since F-j^l, F-mt.^ • • • ; ^in±^ 

are linearly independent, so also are Vi,V2, ■ ■ ■ ,Vd, because Ql does not change 
linear independence. Hence, for 2 < j < D, with the orthogonal projector 

P ^ -5 to < 1^1, ... , Vj-i > with respect to (-, ■)q^n, 

max ||M,.||Q^Ar 

L,- := l|(l — P ~ ~ > 0, and ^ is finite, which guaran- 

tees the existence of an integer Kj such that 2^^L^ e > \ - max 11 Mr II o n 
On the other hand, from Lemma [9. II with n = 1, for £ = 1, 2, . . . j — 1, 

1 „ fl 



^eL.HV,) <^^>' " V2 - V2..{w-i} 



inf^ IIP . (2^^V,) -^11 < \\^- \\VA\q,n < ^max ^ \\V 
These inequalities result in 



inf IIP ^ (2^^VA -g\\ < e\\P ~ (2^^y,-) |L 
seLatt y^.) 
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which shows the existence of a vector ge in Latt(V"£) such that 



< e, for i = 1,2, . . . j — 1. Because the relation 

\\{2^^Vj) -gellQ^N ■ \\ge\\Q,N 

Vj G Latt(\7]^, V2, • • • , Vj-i, 2'^^V j) with the frequency Kj of the substitution Vj 2vj 
made already for Vj is always guaranteed at any step of the iteration of Q2, this fact 
implies that the substitution Vj 2vj in the iteration of Q2 in Table |2] cannot 
be repeated more than Kj + 1 times for Vj. Since the procedures other than the 

D 

substitution Vj 2vj do not increase sq^n{vi, . . . ,vd) '■= ll'^rllg.Ar, at any step of 

r=l 
D _ 

the iteration of Q2, s^/^^...^^^ is always bounded by s := 2'^''^^ 11^^11 q, at- Moreover, 

any process in Q2 gives a vector Vj in Latt(V^i, . . . , Fd). Hence, at any step of 

the iteration of Q2, {vi, . . . , vd) belongs to Tq^j^iVi, . . . ,V o) which is a finite set, 

where T^\ui, . . . , Un) has been defined in Definition 19.21 Similarly to this, because 
any procedure in Ql does not increase sq^n{vi, . . . ,vd) and gives a vector Vj in 
La.tt{F-j^l, . . . , F-j^^), at any step in the iteration of Ql, {vi, . . . ,vd) belongs to the 

D 

finite set f^^MZ • • • , ^/n?^) with a := SQ,r,{Fl^l, F^?)) = ^ 

r=l 

Then, the procedures in Table [2] except for finitely many (not greater than Kj + 1 
for each j) times of the execution of vj ^ 2vj {j = 2,3, ... , D) do not increase 

■So n{vi, . . . , Vd) and the substitutions Vj Vj — '^'^i' j,^ g^j^^ q2 qI_ 

ways decrease sq^n{vi, . . . , vd), unless Vj is changed. Since the sets Tq'^^iyi, . . . ,V £,) 

and Tq^^{F^I}, . . . , F^^^) are finite, this process is also carried out only finitely many 
times. Hence, the procedures in Table [2] halt in a finite number of steps. ■ 

Remark 9.1 In spite of the complications of the above proof of halt, the 
amount of calculations required for the iterations of the substitution pro- 
cesses in Table [2] is much smaller than the amount of calculations required 
for the calculations of the inner products themselves; this has been observed 
empirically. Hence, the iterations of the substitution processes in Table [2] 
are not 'bottlenecks' of our method at all, though proof of their halting is 
'logically' necessary. 

Even for the cases when dim V^n£^(Z"'") < 2, we can prove the halting of the processes 
in Table m and Table [5] for the process Step 3' in a similar manner to this, with some 
modifications, because the basic structure of the procedures is almost the same as 
for the one- dimensional case. Here, we omit it because of the complexity of the 
notations. 



Practical integer-type algorithni for higher-order differential equations 35 

10 Upper bound on errors 

In numerical methods, it is very important to know the precision of the results. Here 
we will give an error bound for our method. 

For this, we begin with two lemmata, with R as in f|T5|) and 

Ak '■= sup — ^^fW^ which is the upper limit of the truncation error, 

/evn£2(z+)\{o} ||ni^/||^2 
normalized in the subspace < eo,ei, ... ck >■ 

Lemma 10.1 Assume that diml^n£^(Z+) = 1. For a nonzero vector w in UnV , 
let W := {atd\ a G C} and U be a subspace of UnV such that H^V = W (B U 

and sup ^ n ^ — ^- Moreover, let Ci := and Ti : = 

u&u\{o} \ \i^\\e^,K ■ |p||^2^^ ||w||£2_^ 

inf 1^1^. // Ti > ^Ci and {f,w)e,K forfe VrM\Z+)\{0}, then 
ueu\{o} \\u\\£2^j^ 

inf IIIIxU^ — ^1|£2 ^ ^ ^ 

This theorem can be generalized to the cases where dim V fl £^(Z^) > 2, as follows: 

Lemma 10.2 Let PYiM(vr\£^{i+)) be the orthogonal projector on UnV to IlNiV fl 
£^(Z+)) with respect to (■, ■)^2 With a positive integer D not greater than diml^ fl 
£^(Z+) and D linearly independent vectors wi, W2, . . . , in Wfj :=< Wi, W2, . . . , wg > 
and Uj=) be a subspace of UnV such that UnV = Wg © and 

sup Jf ^' ^'^^1 ^^i^ < ^. Moreover, letCjj := sup || ^H*^'^ and 

■weWg\{o},ueu^\{o} \ \M\e^,K ■ \ \u\\e2^K wew^\{o} IrIU^x 

:= ^ mf // > and ((1 - Pn^ivniHz+))IlNV) n = {0}, 

u£U^\{0} \\u\\i2^K 

then 

inf llnifW — q\\i2 

Proof of -Lemma no. 21 Let w he a. nonzero vector in Wj^ (c UnV). Since 
PuN(vm^ iz+))'w G nAr(V^ n £^(Z+)), there exists a vector go in V H £^(Z+) such that 
PuN(vm\z+))W = Uxgo- From the definition of Ak, 

\\Ukw - go\\i2 = \\UKiw - go)\\e2^K+\\{l -]lK)go\\e2 

< 11""^ - PuNiVnfi(Z+))W\\£2^K-\-AK\\go\\£2^K 

= 11(1 - PuMivr\e2{z+)))w\\e2^K + AK\\PnN{vr\fi{z+))w\\e2^K 

< 11(1 - PuNivne2(z+)))w\\e2,K + AK\\w\\i2^K- 



sup — - — < + 1 + t; TTT- ■ 
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Since ||w||£2^^ = ||ni^w||£2, therefore, proof of the statement 

11(1 - Pu,^(vneHz+)))w\\e2^K ^ + RAk + 1 

^"P r^i - r 

weWg\{o} \\'^\\e^,K i- D~ s^D 

suffices. In the following parts of this proof, we show this statement. 

Let Pw- be the orthogonal projector to with respect to (•, ■)e2^K, and let 
:= (1 - Pw^)TInV. If ((1 - Pu^ivnPiz+))) TInV) nWj^ ^ {0}, then U^iV n 
£^(Z"'")) + W~ = UnV because the orthogonal complements (in UnV) of (1 — 
Pn,Y(yn£2(z+))) IlivV^ and Wg are IlNiV fl £^(Z"'~)) and W~, respectively, with re- 
spect to (■, ■)i'2^K- Hence, for any nonzero vector w in W^, there exist vectors 
V e njv(^ n and e such that w = v + w^. Then w = Pw^v. 

Since {w^,w)i2^K = 0, the inequalities ||-y||£2 ^ > ||'W^||£2 > hold. 

U V & Wfj, then w-^ = and w = v. This results in 
11(1 — -Pnjv(vn£2(z+)))w 11^2 < 11(1 — Pii)w\\£2^K = 0, which satisfies the statement of 
the lemma obviously. Therefore, in the following, we prove the lemma for the cases 

From the condition UnV = Wjj © Uj^, there exists vectors w G Wj^ and u G Uj^ 

such that V — w -\-u and — J^'^' "^^^M ^ ^ ^ ^ Hence, 

" '"\e'^,K ■ ll^ll^^i^ 



\\Pw-u\\p^K = — ^ C From the assumption that v W^, the 

vector u is not 0. Hence, the ratios z :— — — and a :— — — — are well- 

defined. Then, the inequality d < 1 holds because —w-^ is the perpendicular from v 
to W^. Since w — w — Pw^{w — v) — Pw^u, the above inequality results in 

Z^ ^ ' <^. 

The trigonometric inequality lltillQ^jv < ||'iy||Q,Ar + ||'J^||Q,Ar and the definitions of 
and result in the inequalities 

^olMi^K < C£,\\w\\(,2^K + [R^K + l)\\v\\£2,K 

< C^\\W - w\\i2^K + C'5||W||£2_;^ + {RAk + l)||v||£2^;^ , 

K oo 



r. 

K 

E 



because < ^ -^^^ = {RAk + 1) . Hence, 

\\vh2K ^ 



n=0 
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^oW'^h^K < zC^\\u\\e2^K + {C£, + RAK + l)\\v\\e^,K, because H^H^a^^ < ||t/||£2^;^. From 
the condition — > and the inequahty z < ^ above, F^ — zCj^ > 0. Hence, 

\\W \\£2,K= d\\u\\i2^K< \\u\\i2^K< — \\V\\(.2^K< 7\ TT^ FIU^K, 

i.e. Ilf^ll'--^ < + where , > i= guaranteed as has been shown 

above. 

Let Pij be the orthogonal projector to the subspace 5*^; := {av\a G C}. Then 
from a geometrical comparison of length among the perpendiculars, the relation 
||(l-Pn (vn.2(z.)))^Z;||...^ < = \\$^ holds, because 



5*^ C IIn{V n £^(Z+)) and w {= v + uJ-^) is the orthogonal projection of v to the 
subspace S*^ := {atv \ a E C} with respect to (■, ■)£2^k as well as the projection Pw-v 
to W^. 

TT ■ r, \\{'^ - Pii^{vnP(z+)))w\\t2^K , + RAk + 1 , , , r 
Hence, the mequahty r—r < — — — holds for 

any 

w G Wf^. The lemma has been proved by the combination of this fact and the 
discussion in the first part of this proof. ■ 

The proof of Lemma [10. II is just the proof above with D = 1. 

These propositions lead to the following theorem, which gives the upper bound 
of the error: 

Theorem 10.1 Let -Pn]v(yn^2(z+)) be the orthogonal projector on YInV to liNiV H 
with respect to {■r)p,K- Let T :=< G G^^), . . . , ^(^.2) > (= T^^^ m 

Section^) and R :=< -u^^*^^, ■u^'^^^^, . . . , ■u^^^^'^ > (= Rd^2 Section\^) be 
subspaces o/HatV^ such that UnT = W(BR and suppose that the quasi- orthogonalities 
(a), (6) and (d) in Lemma 7.1 are satisfied for d = Dp . Moreover, let 



C := sup ' ^11'^'^ and F := inf ^ J*^'^ . If ^ > ^(g) C with 

w(^T\{Q} W\\P,K «G-R\{0} ||-U||^2^^ 



9 



\ 



DDp 




9 

and ((1 — -Pnjv(vn^2(2+))) UnV) fl Wt = {0}, then the inequality 

inf \\I{KW — g\\i2 ^ 
9^vnP(zW^ ^ ^ C + l ^ R X ^ 

^enw ||nxw||£2 -v-s,{g)c \ v-i{g)c) ^ 

holds. 
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Proof of Theorem 110.11 From Lemma 18.11 and the quasi-orthogonalities (b) and 



(d), with n = D — D^, Ur 



-*<D,2> 



the inequahty 



K 



H\p,k\\G^'^\\ 




D — Dp) and v 



holds for any vector in w G i? 



and for £ = 1,2,...D^2. Using Lemma [HH] again with n = D12 and Ur. 



Q{m) 



w\\p^K\\u\\e^,K 
R and C, = C,{g), we have the 



(m = 1, 2, . . . , -0^2 ), we have the inequahty sup 

weR\{o},ueT\{o} 
Hence, from Lemma 110.21 with Wj=^ = T, Vg 
statement of the theorem ■ 
The condition ((1 — PnN{vne'^{z+))) ^nV) H T = {0} is satisfied except for very 
special cases with 'artificially bad' approximation where a linear combination of 
obtained vectors IIkG^^\IIkG^'^\ . . . ,IIkG^^''^^ for solutions is orthogonal to the 
space Vr\i'^{Z'^) of true solutions (!) with respect to (■,-)^2^^. Since the algo- 
rithm is used under guaranteed convergence of any nonzero linear combination of 
IIkG^^\ IIkG^^\ • • • 5 IlKG^^t'^\ which has been shown in the last part of Section [8|, 
we may neglect this condition in a practical sense. 

For sufficiently large g, we can choose very small Cio)- Moreover, there is a 
method to obtain an upper limit of C and a lower limit of F using only the numerical 
results, without any knowledge about the true solutions of the differential equation, 
as follows: 

The use of Lemma [GTT] with U = UnV, (■> Oa = {-r)Q,N, Oh = {■i-)p,k and 
fm = G*-"^) (m = 1, 2, . . . Dp) leads us to the inequality 



C < 



\ 



D 



'12 



E 

m=l 



|GM| 



9 



D 



On the other hand, the use of Lemma 16.11 with 



V 



D 



U = li^V, (■,-)y 



9 



■)s = {■,-)q,n and 



Ur 



^<D,2> 



[m 



F > 



1,2, 



D — D£2 ) leads us the inequality — < 



D-D 



i2 



E 

m=l 



II ^<D^2>|| 
\\Um \\P,K 

\\Um \\q,N 



D-D 



^<D,2+l>^ 



and hence 



\ 



D- Dp-l 
Ji 



D-D, 



E 



I -.<i?^2>|| 
\'^rn \\P,K 



In these bounds, the factors 1 



h 
Dp 



9 



and 1 



D - Dp 



^<Df2>\ 



D 



D-1 



\N 



- are nearly equal to 1, for sufficiently large g and h. In the usual 
situations with F >> C and ^((7) << 1 mentioned below, the condition T—^{g) C > 
is satisfied. 
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These upper and lower bounds of C and F, respectively, can be calculated numer- 
ically in the algorithm if we iterate Step 3. A up to = Di2. Hence, this theorem 
gives an upper bound for the L^^^-norm of the error for any non-zero linear combi- 



K 



n=0 



nation of the numerical solutions /ipp,(x) := ^^(nxG^''-')„e„(a;) (d = 0, . . . , -D^a), as 
C+1 / R - 



C -\- 1 / R \ 
the function , , ^ + ( H , , ^ ) of the 'worst magnitude of truncation 



error' Ak- 

In usual circumstances where the algorithm gives good convergence and with 

sufficiently large g, the parameters satisfy r>>i?>>C>>l and ^{g) « 

C 

1. Hence, the above upper bound is approximated roughly by — + Ak, which is 
approximately bounded by 



\\r<{m)\\ 



, m=l II 




11 Numerical results 

In this section, we will give some numerical results of the proposed method. In 
Subsection Ill.H in order to show how accurate the results are, we will treat by 
intent some examples of ODEs which can be solved analytically because there we 
can compare the results with exact solutions up to arbitrary precision, though the 
proposed method can be widely applied to ODEs which can not be solved analytically 
at all. These results contain some examples where we are successful in attaining the 
accuracy with several hundreds or several thousand significant digits by an ordinary 
personal computer. 

In Subsection \11.2\ we will compare theoretically the accuracy and the amount 
of calculation between some typical existing methods and the proposed method. 
However, direct numerical comparisons between the proposed method and existing 
methods are very difficult because usual existing methods with arbitrary-precision 
arithmetic (GNU Multi-Precision Library, for example) often require an astronom- 
ically large amount of calculations in order to attain such an extraordinarily high 
accuracy. Therefore, we will compare only the order of amount of calculations nec- 
essary for attaining a very large number of required significant digits. 



11.1 Numerical results by the proposed method 

In another paper |4j, we have already shown how extraordinarily accurate results 
the proposed method gives for Weber's differential equation (Schrodinger equation 
for harmonic oscillators) whose basic solutions in L2(M) are Hermite functions, and 
there we have given another example of a third order ODE and an example of the 




associate Legendre differential equation. In the former example of that paper, we 
were successful in obtaining the results where the ratio f{xo)/f{xi) between values 
of solution function /(x) coincides up to 2599 digits with the true ratio and the raio 
fn/ fn' between coefficients /„ {n = 0,1,...) in the expansion f{x) = J2n fn^n{x) 
coincides up to 8783 digits with the true ratio. There we observed that the number 
of significant digits are almost proportional a power of the dimension + 1 when 
the dimension Is very large. This implies that the amount of required calculations 
increases almost in a polynomial order of the number of required significant digits, 
from the reasons shown in Subsection 111.21 below. 

In this section, we will give two other examples than those. One is an example 
where we are successful in obtaining perfectly exact ratios between coefficients 
(n = 0, 1, . . . , i^) of the solution /(x), and the other example is for an ODE whose 
true solutions are weighted associate Laguerre functions. 

First, we will give numerical results for the second-order differential equation 



{9x^ -6x + 5)f" + (90a; - 30)/' + 126/ = 0. 



Practical integer-type algorithni for higher-order differential equations 

C(3x- 1) 



41 



The space of its true solutions in L 



IS 



((3a; -1)2 + 4)' 



C e C Kor A;n <3. 



In Figure 3, the resuhs with ko = 2,N+l = 18, 2A K = 2 [^J +ko and J = 2 [^J +k 



are shown, under the normahzation (/, ^{ipkofl + 4'ko-ko~i))H 
result with + 1 = 24 is almost invisible there. 



16 J ' '"0 

1. The error of the 



A^+1 


ratio Y' 


value of Re ^ 
Jo 


/2 

value of Im — 
Jo 


number of 
significant digits 


18 


-59 + 31Z 
33 


-1.78787878... 


+0.93939393 . . . 


1 





24 


-2051 + 1976 i 
1381 


-1.48515568... 


+ 1.43084721 . . . 


2 


2 


30 


-2249 + 2192 i 
1520 


-1.47960526... 


+1.44210526. . . 


4 


3 


36 


-2182 + 2126 i 
1475 


-1.47932203. . . 


+1.44135593... 


6 


5 


48 


-42251 + 41166 i 
28561 


(perfectly exact) 
-1.47932495. . . 


(perfectly exact) 
+ 1.44133608... 


oo 


oo 


true 


-42251 + 41166 i 
28561 


-1.47932495. . . 


+ 1.44133608... 







/2 

Table 6: Numerical results of the ratio — 

Jo 



Moreover, we have investigated within how many digits the ratio between two co- 
efficients fn and fn' in the expansion /(x) = fn^nix) coincides with the true ratio. 
For this differential equation, all the true ratios are rational-complex-valued. With 

k = 2, true ratios are A = 1 and A = + for example, which were 

k f« 28561 

obtained analytically using the computer algebra software system "Mathematica" . 

As is shown in Table 6, for Re the number of significant digits increases monoton- 

Jo 

ically as increases. Moreover, the results with A^ + 1 =48 have attained the exact 

. -42251 + 41166 z , , . 

true ratio , where the ratio exactly coincides with the true value. 

28561 

In this example, with A^ + 1 = 48, the other ratios among /„ (n = 0, . . . , 47) are all 
exact. The results with A^ + 1 > 48 have also yielded this perfectly exact true ratio. 
Since the coefficients /„ {n G Z^) decay almost exponentially in this case as is shown 
in Table 7, we can easily attain the accuracy of the solution function f{x) with more 
than one thousand significant digits. This exponential decay results from the rela- 
tionship between the Fourier series and the basis functions used in our method (men- 
tioned in Section |3]). For example, the numerical results of the ratio /(2)//(l) with 
A^ + 1 = 10000 coincides with the true ratio | (^)^ = 1.44779797562779150 . . .x 10"^ 
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n 


real part of /„ 


imaginary part of 





-6.75984000378. ..e-1 


±0 


1000 


-2.84538929863. . .e-271 


-9.83514249870. . .e-272 


2000 


+5.40708023241. ..e-550 


-7.83428979204. . .e-549 


3000 


+8.49092503337. ..e-827 


-1.66804004343. . .e-827 


4000 


+2.95369741917. ..e-1105 


+6.01269219021. . .e-1105 



Table 7: Almost exponential decay of the coefficients 

up to 1942 significant digits. 

For ODEs where the ratios between coefficients are irrational, we can not attain 
such perfectly exact ratios, because the results of our method are always rational- 
(complex-) valued. However, as has been shown in [Ij, the results have been successful 
in that very good rational approximations of the true irrational ratios were obtained, 
with 8783 significant digits as was mentioned above, for example. 

In the following, we will give the second example, for the Fuchsian-type ODE 

xf'ix) + fix) + (^-| + + ^) - ^) fix) = (33) 
with nonnegative integers fi and u. Since we have the Fuchsian-type ODE for (pix) := 

x(j)"ix) + (/i + 1 — x)(f)'ix) + i^fix) = 

which is the associate Laguerre differential equation, it is easily shown that the 
solutions in L^^^^(R) are proportional to the weighted associate Laguerre function 

with the associate Laguerre polynomial L^ix). From the last discussion in Section 
[21 the ODE fl33l) can be treated by our algorithm as the Fuchsian-type ODE 

x'fix) + xf'ix) + (-^ + (z/ + ^)x - fi'^ fix) = (34) 

whose coefficient functions are polynomials. In this ODE, the polynomial P2ix) = x^ 
of the highest order term has zero point at x = 0. However, as has been found in 
many other Fuchsian-type ODEs, numerical results always converge to true solutions 
empirically. So is this case, and here we show how the results converge to true 
solutions in this case. 

In FigH the results with ^ = 4, = 3, A;o = 0, + 1 = 200, 500, 
K = 2[^J + ko and J = 2[^J + ko are shown, under the normalization 
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Number of significant digits of f(9)/f(4) 



1 


1 1 1 1 1 1 _ 

true 


1 

1 


/ N+l=t)UU 


1 


1 \ / N+1=206 

I.I.I 





20 



10^ 



10' 



10 



10^ 



.0DE(1 1 .3), variable:u s.t. x=900u^ 
f(9)/f(4)=g(3/30)/g(2/30) , * 



A A 



* ^ 1dDE(11.2), variable:x 



10' 



10^ 
N+1 



Figure 4: Numerical results for pjg^^^ 5. Comparison of the num- 



ODE (Bl, i.e. for ODE (|33 



ber of significant digits of the ratio 
/(9)//(4) between ODE §^ and 
ODE (I35D 



(/, ^{ipkofi + V'A:o,-fco-i))'H = 1- The error of the result with + 1 = 500 is almost 
invisible there. It is remarkable that the obtained numerical results are very close to 
zero for a; < even though the basis functions are not small for x < 0, though a small 
oscillation is observed in the result with + 1 = 200. However, the convergence in 
this case is less rapid than the ODEs without zero points of pm{x), because of the 
singularity of the solution at a; = where the solutions are not (/i/2 + l)-th order 
differentiable. 

To avoid this problem, instead of ODE (!33|) . we solve the ODE 

u'^g'iu) + ug'{u) + (-cV + c^{Au + 2/i + 2)u^ - i?) g{u) = (35) 

with a positive constant c, which is derived directly from (!33|) by the change of 

coordinate x = {cuY (where g{u) := f {{cu)'^). By this change of coordinate, we 

can obtain the solutions only for x > 0, which causes no inconvenience because the 

true solutions are zero for x < 0. By this change of coordinate, the accuracy of the 

numerical result improves drastically, as is shown in Fig. where the number of 

/(9) 

significant digits of the ratio compared between the ODEs fl5^ and ([35]) 

with c = 30. As is shown in this figure, the number of significant digits empirically 
increases as a power of A^ for A^ + 1 > 10^ in the case of ODE fl35l) . while it increases 
approximately proportional to logA^ due to the singularity at x = in the case of 
(IMI) . The bad behavior for ODE is theoretically deduced also from the fact that 
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the order of coefficient /„ in the expansion /(x) = J2n fn^n{,x) is an inverse power of 
for a function f{x) with a singularity of this type, in a similar way to the case of 
Fourier series. The above change of coordinate eliminates this singularity, and it is 
successful in improving the accuracy to a great extent, up to several hundred digits, as 
is shown in Figure O Moreover, the number of significant digits increases almost in a 
power of when > 2000. This implies that the amount of required calculations 
increases almost in a polynomial order of the number of required significant digits, 
from the reasons shown in Subsection 111.21 below. 

11.2 Theoretical comparison with existing methods 

In this subsection, we compare theoretically the order of the amount of calcula- 
tions required for a very high accuracy, between some typical existing methods with 
arbitrary-precision arithmetic and the proposed method. The comparison is made 
with the following three methods: 

(a) Runge-Kutta methods with arbitrary-precision arithmetic. 

(b) Finite element methods with arbitrary-precision arithmetic. 

(c) Petrov-Galerkin method with arbitrary-precision arithmetic using the same glob- 

ally smooth basis functions as this paper. 

In the following, we compare them for the case where we require Q significant 
digits for the ratio /(xi)//(xo) between the values of a solution function at two 
points xq and xi. 

Order of amount of required calculations in the proposed method The amount 
of required calculations in the proposed method is almost a power of Q when 
Q is very large, because it requires about 0(A^'^(log A^)^) as was explained in 
Section O (after Theorem 15. 2p and many numerical resuts show that Q is almost 
proportional to a power of when A^ is very large. Empirically the amount of 
required calculations is from O(Q^) to 0{Q^) in many cases. (Moreover, from 
the discussion in Section [5|, we may redice it to be from O(Q^) to 0{Q^) by 
means of the modification proposed after Theorem 15.21 ) 

Comparison with (a) Runge-Kutta methods As is shown in the followong. Method 
(a) requires more than an exponential order of Q. As is known well, the dis- 
cretization error of the Runge-Kutta methods is proportional to a power 
ip > 4) of the step size h for the discretization of the coordinate. (For exam- 
ple, in the common Runge-Kutta method, proportional to /i^.) This implies 

that the step size h should satisfy the inequality CqW < IQ^^ with a positive 

log 10 

constant Cq. This inequality implies the inequality log/i < Q + Ci 

P 

with another constant Ci. Since the numbers Ug of the steps between xq and 
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xi is ^ ° — —, the number of required steps should satisfy the inequahty 
1 10 

logn^ > — ^ — Q + C2 with another constant C2. Hence, > C3 (10) with a 

positive constant C3. Moreover, the number of required digits for the working 
precision is larger as Q is larger. (At least, it should be larger than Q.) These 

fact implies that the amount of required calculations is at least 0{Q ■ (10) p ). 
Can you imagine how huge is the amount when Q is several hundreds or several 
thousands? Hence, the proposed method requires a much smaller amount of 
calculation than the Runge-Kutta methods, for a very high accuracy. 

Comparison with (b) finite element methods As is shown in the following, also 
Method (b) requires more than an exponential order of Q. As is known well, 
the error due to finite-dimensional approximation in the finite element methods 
is proportional to a power of the support size of the finite elements, at least. 
For example, when the basis functions are piece-wise polynomials of degree q 
with support size d, the error is approximately proportional to d'^'^^ at least, 
which is easily shown by the Taylor expansions of the true solution f{x). Since 

the required dimension of the subspace is proportional to — — — and the 

d 

matrix is band-diagonal, from a discussion very similar to the above Runge- 

Q 

Kutta cases, the amount of required calculations is at least 0{Q ■ (10) "?+!). 
This will be huge when Q is very large. Hence, the proposed method requires 
a much smaller amount of calculation than the finite element methods, for a 
very high accuracy. 

Comparison with (c) Galerkin methods using globally smooth basis functions 

For Method (c), since the direct order comparison is difficult, here we only point 
out that it requires a very large amount of calculations. Let W be the number 
of required digits for the working precision, and + 1 be the dimension of 
the subspace. Then, since the matrix is band-diagonal, the amount of required 
calculations is 0{W'^N). (If we use fast multiplication algorithms, the Karat- 
suba algorithm for example, the order may decrease to some extent. However, 
in our numerical examples, we did not use such algorithms. If we use such 
algorithms, we can diminish the order to the same extent as that. Therefore, 
for simplicity, here we compare the order without such algorithms.) 

In Method (c), it is very difficult to calculate the eigenvector in a high accuracy 
because of a heavy 'cancelling' due to round-off errors, by the following reason, 
even if the exact eigenvalue is known. In this method, the matrix is band- 
diagonal with band width 2io + 1 [!]• When we calculate the elements /„ (n = 
0, 1, . . . , A^) of the solution vector /, with unknown intial values /o, /i, • • • /^o-i? 
we should determine these initial values so that the linear equations given by 
the bottom io rows of the matrix can be satisfied. This problem can be reduced 
a system of io inhomogeneous simultaneous linear equation represented by a 
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io X io matrix. 

However, this io x £9 matrix is usually very close to a singular matrix of rank 1, 
because of the most diverging component contained in the halfway of the cal- 
culations of /„ {n = + . . . N — Eq). In other words, with whatever initial 
values, the vector obtained in the halfway of the calculation are almost parallel 
to this diverging component, because this diverging component is dominant. 
Moreover, that io x matrix is more close to a singular matrix of rank 1, as 
the dimension + 1 is larger. 

Though the order of this approach is difficult to estimate because it depends 

W 

on the differential equation, anyway we should choose W so that — — — 

[to — 1)Q 

can be much larger than a monotonously increasing function of not smaller 
than 1, in order to avoid the above mentioned 'cancelling'. The total order 
estimation of this case and the comparison with the proposed method is one 
of future problem. 

Even if we use other methods for the calculation of the eigenvector, the Gaus- 
sian elimination or the diagonalization for example, the basic mathematical 
structure is the same as the above, and the problem due to the closeness to a 
linear dependence arises there. 

Even if we use the Galerkin methods using another type of globally smooth 
basis functions, the Hermite functions for example, the basic circumstance is 
still similar to this. 

Thus, the proposed method requires a much smaller amount of calculations than 
Runge-Kutta and finite element methods, in order to calculate the values of the 
solution function f{x) at a finite number of points in an extraordinarily high accuracy. 
This is the reason why we have many nurerical results by ordinary personal computers 
where Q is so large as several hundreds or several thousands. 



12 Conclusions 

We have explained how to realize the integer-type algorithm proposed in [¥] for 
linear higher-order differential equations, and proved that the realization proposed 
there satisfies the conditions given in C7 which are required for the convergence of 
numerical results to the true 'general solutions' in "H of the differential equations. 

We have proved that the method based on quasi-orthogonalization can obtain 
vectors in the quasi-optimal set (ck]v)~^[0' '^^a^Iv] "with fixed c. Moreover, we have 
provided some proofs about the extension for the case with dim fl £^(Z+) > 2 and 
the halting of the procedures. 

In addition, we have given a theoretical upper bound for the errors as a function 
only of the worst truncation error, in terms of one unknown parameter Ak (the worst 
truncation error of the exact true solutions) . 
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Numerical results have shown the precision of the proposed method. In addition, 

we have given an example which attains the exact ratios among the coefficients /„ 

in the expansion /(x) = fn^n{x). 

In the near future, we will compare the norm of actual errors in numerical results 

and the upper bound proposed in this paper, and investigate the tightness of this 

upper bound. Remaining problems include improvements of the algorithm to reduce 

the number of calculations and better choices of the bilinear form Q{f,g), and the 

orthogonality parameters h and g. 
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A Other orthogonality- like relations for ipk^n 

With respect to the usual L^-inner product 

/oo 
f{x) g{x) dx , 
-oo 

the orthogonality-like relation 



]^ (_l\n-n' (^^)' (\n^n'\<:k- 

4^= ^ ^ {k + n-n'y. {k + n' ~ny. ^' ' 



(otherwise) 
holds. With respect to another inner product 

dy 



(/,^?)|D|-:= / (J^fMiJ^gMyrj: 

J— CO \y\ 

(where J-" denotes the operator of Fourier transformation) , another type of 
orthogonality-like relation 

V n e Z+, (^fc,„,Vfc,n')|D|-fc = ^ (n + 2ky. 

holds. This relation is derived from the orthogonality of the number states associated 
with the algebra su(l, 1) [11]. Here note that (/, g)(^o) = (/, g)\D\-° = (/, g)- 
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B On the shape of the wavepackets ipk,n{x) 



The wavepackets are complex-valued 'wavy' functions. By the scale change x — >■ 
x/a/^, the 'envelope' function 



\i'Kn{x)\ = {x^+iy 



fc+1 



(36) 



of the wavepacket ip^ ^x) is proportional to the probability density function of the 
(Student) t-distribution with the degree of freedom k (used in statistics), which tends 
to the standard Gaussian function as /c — )■ oo. Since the 'foot of the mountain' of 
the probability density function of the t-distribution is thicker when k is smaller, the 
'localization' of the envelope of the wavepacket is better as k becomes larger when a 
comparison is made with respect to the normalized 'width' (or standard deviation). 
The 'phase' function of the wavepacket is 



arg iJk,n{x) 



{k + 2n + 1) (arctanx — — ) . 



(37) 



This function is almost linear when |x| is not too large (because then arctanx ~ x), 
and we can approximate the wavepacket by a sinusoidal wavepacket with the above 
envelope function, because the envelope function is sufficiently small where the phase 
function deviates considerably from the linear function {k + 2n + 1) {x — ^). This 
approximate picture is very good especially in the cases with large k though it is 
poor for the cases with small k, because the 'foot of the mountain' of the envelope 
vanishes very rapidly for large k. In fact, for sufficiently large k, 



fc+1 



or ^,,„(x) ^ e^C^^^'^^^)^ e-t^^ 



For this, we can prove easily that the function Ef^e'^^'''"^ ipk,ni—^) with 



tion 
1 



r(^) _ k + 2n + l 



7rv^r(f) 



and ik,n ■-- 



Vk 



converges to the standard Gaussian func- 



e 2^ as A; — )■ oo for the L^-norm, by means of the upper and lower bounds of 



the Stirling formula, because the properties fl5^ and fl?r|) lead us to 

1 



Vk^ 



2n 



ix2 

e 2^ 



(38) 



-i (fc+2n+l)(^ 



- arctan 



\ 



2-K 



:2 ^^2 



fc+1 



e 2^ . 
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This picture of the wavepackets, 'almost-sinusoidal' oscillations with a spindle- 
shaped envelope, which are similar to Gaussian-weighted (complex) sinusoidal 
wavepackets, is 'natural' and useful in many applications, and it is very convenient 
for the interpretation of the basis systems used in this paper (for this, the relationship 
between these wavepackets and the Fourier series is shown in Appendix B of [5j). 
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